System For Individualized Customer Interaction

ABSTRACT

A method and system for using individualized customer models when operating a retail establishment is provided. The individualized customer models may be generated using statistical analysis of transaction data for the customer, thereby generating sub-models and attributes tailored to the customer. The individualized customer models may be used in any aspect of a retail establishment's operations, ranging from supply chain management issues, inventory control, promotion planning (such as selecting parameters for a promotion or simulating results of a promotion), to customer interaction (such as providing a shopping list or providing individualized promotions).

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation (and claims the benefit of priority under 35 USC 120) of U.S. application Ser. No. 13/099,424, filed May 3, 2011, now allowed, which is a continuation of U.S. application Ser. No. 11/069,472, filed Feb. 28, 2005, now U.S. Pat. No. 7,945,473, issued May 17, 2011, which claims the benefit of U.S. Provisional Application Ser. No. 60/548,261, filed Feb. 27, 2004. The disclosures of the prior applications are considered part of (and are incorporated by reference in) the disclosure of this application.

BACKGROUND

Retailers have been collecting large quantities of point-of-sale data in many different industries. One area that has been particularly active in terms of collecting this type of data is grocery retailing. Loyalty card programs at many grocery chains have resulted in the capture of millions of transactions and purchases directly associated with the customers making them.

Despite this wealth of data, the perception in the grocery industry is that this data has been of little use. The data collection systems have been in place for several years but systems to make sense of this data and create actionable results have not been very successful. There have been efforts to utilize the retail transaction data. For example, research in mining association rules (R. Agrawal and R. Srikant, Fast algorithms for mining association rules. In Proc. of 20th Int'l Conference on Very Large Data Bases, Santiago, Chile, 1994) has led to methods to optimize product assortments within a store by mining frequent item-sets from basket data (T. Brijs, G. Swinnen, K. Vanhoof, and G. Wets, Using association rules for product assortment decisions: A case study. In Knowledge Discovery and Data Mining, pages 254-260, 1999). Customer segmentation has been used with basket analysis in the direct marketing industry for many years to determine which customers to send mailers to. Additionally, a line of research based on marketing techniques developed by Ehrenberg (A. Ehrenberg, Repeat-Buying: Facts, Theory, and Applications, Charles Griffin & Company Limited, London, 1988) seeks to use a purchase incidence model with anonymous data in a collaborative filtering setting (A. Geyer-Schulz, M. Hahsler, and M. Jahn, A customer purchase incidence model applied to recommender systems, in WebKDD2001 Workshop, San Francisco, Calif., August 2001).

Traditionally, most of the data mining work using retail transaction data has focused on approaches that use clustering or segmentation strategies. Each customer is “profiled” based on other “similar” customers and placed in one (or more) clusters. This is usually done to overcome the data sparseness problem and results in systems that are able to overcome the variance in the shopping behaviors of individual customers, while losing precision on any one customer.

A major reason that individually targeted applications have not been more prominent in retail data mining research is that in the past there has been no effective individualized channel to the customer for brick & mortar retailers. Direct mail is coarse-grained and not very effective as it requires the attention of customers at times when they are not shopping and may not be actively thinking about what they need. Coupon based initiatives given at checkout-time are seen as irrelevant as they can only be delivered after the point of sale. Studies have shown that grocers lose out on potentially 11% of sales due to forgotten items, which highlights the need to find effective individual channels to customers at the point of sale prior to check out.

With the advent of PDAs and shopping cart mounted displays, such as the model Symbol Technologies is piloting with a New England grocer, retailers are in a position now to deliver personalized information to each customer at several points in the store. In fact, a few systems have been developed that attempt to deliver personalized information to customers. For example, the IBM Easi-Order system allows a list to be developed on a customer's PDA, which is then sent to the store to be compiled and picked up (R. Bellamy, J. Brezin, W. Kellogg, and J. Richards, Designing an e-grocery application for a palm computer: Usability and interface issues, IEEE Communications, 8(4), 2001). In a system developed at Georgia Tech, a PDA was used as a shopping aid during a shopping trip to show locations and information on items in a list (E. Newcomb, T. Pashley, and J. Stasko, Mobile computing in the retail arena, in Proceedings of the conference on Human factors in computing systems (CHI2003), pages 337-344. ACM Press, 2003). In each of the IBM and Georgia Tech systems, the shopping list was emphasized as the essential artifact of a grocery trip, enabling all other interactions. Both also stated as a design goal that it should be possible to compile or augment a shopping list per customer based on previous purchase history. In another example, the 1:1 Pro system was designed to produce individual profiles of customer behavior in the form of sets of association rules for each customer which could then be restricted by a human expert (G. Adomavicius and A. Tuzhilin, Using data mining methods to build customer profiles, IEEE Computer, 34(2):74-82, 2001). Despite these efforts, there has not been a thorough experimental attempt to predict and evaluate individually personalized customer shopping lists from transactional data with a large set of customers.

Therefore, given the massive amounts of data presently being captured, and the imprecise predictive ability of clustering and segmentation approaches, there is a need to better utilize the captured data, such as by better predicting a shopping list. Likewise, there is a need for a system that provides a predictive shopping list to customers using the customer models with reduced processor resources, so that the lists can be delivered locally on mobile processing devices attached to shopping carts. Also, there is a need to better utilize the captured data to provide enhanced promotion and planning for retail establishments and others in the supply chain.

BRIEF SUMMARY

The above needs may be satisfied by the present invention. In one embodiment of the invention, a method and system for individualized communication with a customer is provided that includes a customer model creation component configured to create at least a part of a customer model for the customer by statistically analyzing transaction data solely from the customer, a customer identification component configured to determine the identity of the customer, and a customer communication component configured to access the customer model and the identity of the customer and to determine a content of a communication based on the at least a part of the customer model.

In a second embodiment of the invention, a method and system is provided that includes a shopping list computing device configured to communicate with a server, a customer identification component, a shopping list prediction component configured to generate a proposed shopping list based on a statistical analysis of the transactional data associated with the customer, and a display component configured to display the proposed shopping list on the mobile computing device.

In another embodiment of the invention, a product promotion method and system is provided that includes a customer identification component, a customer model comprising a plurality of attributes derived from transaction data associated with the customer, a promotion prediction component configured to select a product, an output device, and a promotion computing device configured to generate a promotion for the selected product based on the attributes in the customer model and to transmit the promotion to the output device, which may be a mobile output device.

In a third embodiment, a promotion planning method and system is provided that includes a parameter selection component configured to select parameters for a promotion by optimizing pre-determined goals of the promotion, a customer selection component communicating with the parameter selection component, the customer selection component configured to select a subset of customers based on the selected parameters, a promotion simulator component communicating with the customer selection component, the promotion simulator component configured to simulate outcomes of the promotion with the selected parameters and the subset of customers, and an output device communicating with the promotion simulator component, the output device configured to present the simulated outcomes.

In a fourth embodiment, a promotion planning method and system is provided that includes a parameter selection component configured to select parameters for a promotion, a customer selection component communicating with the parameter selection component, the customer selection component configured to select a subset of customers based on the selected parameters and based on customer models, at least a portion of each customer model being derived from statistical analysis of customer data consisting of transaction data associated with a respective customer, a promotion simulator component communicating with the customer selection component, the promotion simulator component configured to simulate outcomes of the promotion with the selected parameters, the subset of customers, and the customer models of the subset of customers, and an output device communicating with the promotion simulator component, the output device configured to present the simulated outcomes.

In a fifth embodiment, an inventory planning method and system for a retail establishment is provided that includes a plurality of customer models for customers of the retail establishment, at least a part of each customer model generated by statistical analysis of transactional data for a product category for a respective customer, and an inventory planning component accessing the plurality of customer models, the inventory planning component configured to estimate purchases for the product category in a pre-determined period and configured to aggregate the estimated purchases.

The foregoing summary has been provided only by way of introduction. Nothing in this section should be taken as a limitation on the following claims, which define the scope of the invention.

BRIEF DESCRIPTION OF SEVERAL VIEWS OF THE DRAWINGS

The invention can be better understood with reference to the following drawings and description. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.

FIG. 1 is a block diagram of a customer model training system.

FIG. 2 is an expanded block diagram of the customer model training module in the customer model training system depicted in FIG. 1.

FIG. 3 is a block diagram of an individualized customer interaction system.

FIG. 4 is a block diagram of a shopping list prediction runtime module depicted in FIG. 3.

FIG. 5 is a block diagram of an individualized customer interaction module depicted in FIG. 3.

FIG. 6 is a block diagram of a promotion sensitivity runtime module depicted in FIG. 5.

FIG. 7 is a block diagram of an individualized customer interaction system for a grocery store.

FIG. 8 is a graph of results for a top N results method, customer averaged.

FIG. 9 is a graph of results for a top N results method, transaction averaged.

FIG. 10 is a graph of linear classifier performance at confidence thresholds for a customer averaged method, Winnow.

FIG. 11 is a graph of linear classifier performance at confidence thresholds for a customer averaged method, Perceptron.

FIG. 12 is a block diagram of a promotion planning system.

FIG. 13 shows a projection screen illustrating optimization and promotion simulation of the promotion planning system.

FIG. 14 shows a projection screen illustrating a mechanism for viewing results of past promotions.

DETAILED DESCRIPTION

Any party who offers goods or services may be considered a “retail establishment.” Similarly, any party who purchases goods or services may be considered a “customer.” Therefore, there are many different types of “retail establishments” and many types of “customers.” Examples of retail establishments and customers include: (1) a retail store, with the customers being its shoppers; (2) a wholesaler, with the retailers who purchase goods from the wholesaler acting as customers; or (3) a manufacturer, with the parties who purchase the manufactured goods (either retailers, wholesalers, or shoppers) acting as customers. These examples are merely for illustrative purposes.

In any of these retail establishment-customer relationships, the retail establishment typically has many aspects to its operations, ranging from supply chain management issues, inventory control, promotion planning, to customer interaction (such as before, during, or after the retail experience). The retail establishment may wish to improve any one, some, or all of these aspects of its business.

In order to improve on any aspect of its business, customer models may be used. For example, customer models of an individual customer may be derived, at least in part, based on statistical analysis. Parts, or all, of the model may be derived from data solely from the individual customer, such as transaction data from the customer. Data from other customers, such as transaction data, need not be used in compiling parts, or all, of the customer model. The customer model may comprise one or more sub-models, such as a shopping list sub-model, or may comprise one or more attributes, such as behavior, brand loyalty, wallet share, price sensitivity, promotion sensitivity, product substitution, basket variability, frequency of shopping, etc. The customer model may thus be used in any aspect of a retail establishment's operations, ranging from supply chain management issues, inventory control, promotion planning, to customer interaction (such as before, during, or after the retail experience).

One application of the customer model is a shopping list prediction method and system. The shopping list prediction system may estimate product categories that a customer may purchase for a given shopping trip at a retail establishment, such as a grocery store. The estimate may be based on the shopping list sub-model that is statistically derived from transaction data for the specific customer. For example, the shopping list sub-model may be generated using customer data solely from the specific customer. Transaction data from other customers or manual customer input may not be necessary in generating the shopping list sub-model. The customer data, such as customer transaction data, may be statistically analyzed to estimate purchase of one, some, or all of the product categories of the retail establishment. Product categories may include any grouping of products including a product class, individual products, or specific types of individual products. And, one or more statistical analyses may be used in generating the shopping list sub-model, such as rule-based and machine learning statistical analyses. The shopping list sub-model may be generated prior to the given shopping trip and updated with current parameters (such as current date, time, etc.) or may be generated concurrently with the given shopping trip.

Another application of the customer model is a product promotion method and system for a retail establishment, such as a grocery store. A product category may be suggested for purchase, such as a suggestion from a shopping list system. Based on one or more attributes of the customer model, the product promotion system and method may determine whether and/or what type of promotion a customer may receive for the product category. The attributes may be statistically derived solely from a customer's transaction data and may include behavior, brand loyalty, wallet share, price sensitivity, promotion sensitivity, product substitution, basket variability, and frequency of shopping. Further, the product promotion method and system may bill for promotions provided to customers. The billing may be for an impression of the promotion to the customer or may be for acceptance of the promotion. Moreover, the billing may be independent of or dependent on the customer who is provided the promotion. For example, the billing may depend on one or more attributes in the customer model for the customer receiving the impression or accepting the promotion. Or, the billing may depend on the goals of the promotion, such as brand, revenue, lift, and market share. For example, the billing may depend on whether a brand switch or a brand extension has occurred.

Still another application of the customer model is a promotion planning method and system for a retail establishment, such as a grocery store. A promotion may have stated goals and may have certain parameters. Examples of goals of a promotion may include brand, revenue, lift, and market share. Examples of parameters include duration of the promotion, type of promotion, amount of promotion, characteristics of customers targeted, etc. The promotion planning method and system may select the parameters of the promotion to optimize, such as by local or global optimization, one or more of the stated goals. The promotion planning method and system may use the customer models in order to determine which subset of customers to select for the promotion based on the selected parameters. Further, the promotion planning method and system may simulate the promotion with the subset of customers and the selected parameters using the customer models. The simulation may include any one, some, or all of: number of expected visits; number of expected impressions; average number of impressions per switch; brand switches because of the promotion; brand extensions because of the promotion; new trials of the product; non-promotion volume; the promotion volume; promotion cost; discount per impression; the cost per switch; the revenues from the promotion; and incremental profit for a predetermined number of replenishment cycles due to the promotion. Based on the output of the simulation, new parameters for the promotion may be selected, and the simulation may iterate with the new parameters.

Another application of the customer model is an inventory planning method and system for a retail establishment, such as a grocery store. The inventory planning method and system may use the customer models for customers of the retail establishment in order to estimate purchases of a product category for a predetermined period. The estimated purchases of the product for the individual customers may be summed in order to provide an estimate for the predetermined period. Moreover, those customers of the retail establishment who do not have an individual customer model may be assigned an average customer model, thereby accounting for all potential customers of a retail establishment. The average customer model may use data from a plurality of customers, such as transaction data for all customers for the product category or data for a subset of customers for the product category.
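The aggregation described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each customer model can be reduced to a function returning the expected number of units of one product category purchased over a period; the names CustomerModel and estimate_category_demand are hypothetical and do not appear in this disclosure.

```python
from typing import Callable, Dict, Iterable

# Hypothetical representation: a customer model maps a planning period
# (in days) to the expected units purchased in one product category.
CustomerModel = Callable[[int], float]


def estimate_category_demand(
    customer_ids: Iterable[str],
    models: Dict[str, CustomerModel],
    average_model: CustomerModel,
    period_days: int,
) -> float:
    """Sum per-customer purchase estimates for one product category."""
    total = 0.0
    for customer_id in customer_ids:
        # Customers without an individual model are assigned the average model.
        model = models.get(customer_id, average_model)
        total += model(period_days)
    return total
```

For example, with individual models for two customers and an average model for a third, the estimate for an eight-day period is simply the sum of the three per-customer estimates.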

I. Customer Interaction

A retail establishment may wish to interact with its customers in order to meet any pre-defined criteria such as increased sales, increased profit, improved service, etc., as discussed in more detail below. One method to improve the interaction between the retail establishment and the customer is to generate individual customer profiles and to use the profiles for various aspects of the retail establishment's operations.

An example of a type of retail establishment is a grocery store. A grocery store has several aspects to its operations, ranging from supply chain management issues, inventory control of the items it sells, promotion planning, to customer interaction (such as before, during, or after the sale). Focusing on customer interaction, for example, grocery stores have attempted to interact with customers for a variety of reasons, such as increasing sales, profit margin, customer loyalty, etc. In order to interact more effectively with the consumer, the grocery store may provide individualized and personalized interactions with customers before the customer enters the store, during shopping as the customer navigates through the store, and after the customer leaves the store. Instead of using traditional approaches, which often fail to be adequately personalized to the individual, one manner to communicate with the customer is to generate an individualized customer model, learning separate classifiers for each customer based on historical transactional data. For example, the transactional data from loyalty card programs in grocery stores may be used to create attributes of a customer model, as discussed in more detail below.

The individualized and personalized interaction may take a variety of forms. One such form is generating a shopping list for the customer. Customers often fail to generate a list for grocery shopping, or if they do, the list may be incomplete. Moreover, prior systems require customer input of selecting items in order to generate the list. In contrast to prior systems, the customer may be presented with a suggested list of items that is based, at least in part, on statistical analysis of the transactional data. The statistical analysis of the transactional data may generate a predicted shopping list for any product category, such as a predicted shopping list for a product class (such as a prediction for yogurt, milk, or eggs), individual products (such as Dannon® yogurt), or specific types of individual products (such as Dannon® 10 ounce strawberry yogurt). The statistical analysis may use models, classifiers, predictors, or the like using the customer's transactional data to generate a predicted shopping list. Moreover, the statistical analysis may be updated every time additional transactional data for the customer is generated. Thus, the shopping list does not require the customer to tag certain items to compile a shopping list. Rather, the shopping list may be derived from (or may be generated solely based on) the transactional data for the customer.

The predicted shopping list benefits the customer in several ways. First, the grocery store provides a valuable service to the customer. Second, by suggesting a realistic shopping list, the customer is reminded of purchases he or she might have otherwise forgotten. These suggestions translate into recovered revenues for the store that might otherwise be transferred to a competitor, or foregone as the customer goes without the item until the next shopping trip. Third, because the list of items is available, promotions may also be provided to the customer related to the list of items (such as a discount to buy a larger size of an item or a different brand of the item). A promotion may be any customer communication designed to promote a sale relating to an item, such as an advertisement for the item, a discount for the item, an advertisement for a related item (such as a substitute product for the item, a brand extension, etc.), a discount for the related item, etc.

Another form of personalized interaction is determining the shopping habits of the customer, and using the shopping habits to better interact with the customer. A few examples of shopping habits, discussed below, include promotion sensitivity, basket variability, price sensitivity, brand loyalty, and wallet share. The listed shopping habits are merely for illustrative purposes. Other shopping habits are also available. Further, the shopping habits may be considered “global” (affecting all items purchased by the customer), may be for a general product (such as promotion sensitivity for milk), or may be for a specific product (such as loyalty to a specific brand of milk).

To individualize and personalize the interaction with the customer, the retail establishment may generate a customer model that is specific to a particular customer. A part, or all, of the customer model may use statistics based on customer data solely from the specific customer (and not from customer data from other customers). This is unlike the statistics used in previous systems, such as clustering or segmentation techniques, which used customer data from other customers.

There are several contexts where the customer data for the specific customer is sufficient to generate a part, or all, of the model. One context is a grocery store, which often records transactions with customers, including data regarding the date of the visit to the store, the items purchased, the price paid, etc. Technology, such as customer relationship management (CRM) technology, has allowed providers to collect large quantities of point-of-sale (POS) data in many different industries. Grocery stores often use loyalty card programs to capture information about millions of transactions and purchases, where such information may be associated with the customers making the transactions and purchases. As discussed below, a training system may use the customer data for a specific customer to create a customer model that is individualized to a particular customer. A runtime system may then use the customer model prior to, during, or after shopping in a variety of ways, such as generating shopping lists or providing promotions, in order to individualize and personalize the interaction with the customer.

A. Training System for Customer Model

In the drawings where like reference numerals refer to like elements, FIG. 1 is a block diagram of a customer model training system 100. The training system 100 may use a training transactional database 110 that contains historical shopping data for a plurality of customers. The historical shopping data may include: name of the customer, address, dates and times of shopping events, items purchased and price paid for shopping events, promotional offers received (including offers accepted and rejected), etc. Other historical data may be included in the training transactional database 110. Though the training transactional database 110 is depicted as one block, the data may be resident in a single database or may reside in multiple databases. The training system 100 further includes a computing environment 120. The computing environment 120 may comprise a general purpose computing device which performs arithmetic, logic and/or control operations. As shown in FIG. 1, the computing environment 120 includes a customer model training module 130 and a customer models database 140. The customer model training module 130 may receive data from the training transactional database 110, and may compile attributes of a specific customer to store in the customer models database 140.

As discussed in more detail below, the customer model for a specific customer may be composed of sub-models or attributes of the customer. The attributes may be derived in a variety of ways, such as by storing data from the training transactional database 110 unmodified or by performing transformations on the data (such as via statistical analysis) to derive attributes which are specific to the individual customer. For example, attributes may comprise identification information (e.g., name, age, postal address, telephone number, e-mail address, etc.) and may comprise derived statistical information (such as attributes directed to behavior, brand loyalty, wallet share, price sensitivity, promotion sensitivity, product substitution, basket variability, frequency of shopping, etc.). Moreover, the customer model may comprise sub-models, classifiers, or predictors for a predicted shopping list for the customer. As discussed above, the attributes of the model may be global to the shopping habits of an individual customer (such as basket variability), may be for a general product (such as hoarding of milk), or may be for a specific product (such as loyalty to a specific brand of yogurt).

Referring to FIG. 2, there is shown an expanded block diagram of customer model training module 130. The customer model training module 130 may create attributes for a model of a specific customer by performing various operations on the data received from the training transactional database 110, such as storing the data unmodified and creating new attributes or sub-models derived from the data (such as based on statistical analysis of the data). FIG. 2 shows a series of modules which may be executed in creating a customer model. Though a specific sequence for execution of the modules is shown, the modules may be executed in any sequence.

A non-derived attributes module 210 may be executed, as shown in FIG. 2. The non-derived attributes module 210 may generate attributes of the customer model which do not require any conversion of the content of the data. The non-derived attributes may include customer identification information, such as the customer's name, address, telephone number, e-mail address, etc. This data may be included in the training transactional database 110, and may be stored in unmodified form in the customer model. Moreover, the customer model training module 130 may further execute derived customer model module 220. This module may derive sub-models or attributes from the raw data from the training transactional database 110. One, some, or all of the sub-models or attributes derived from statistical analysis may be based solely on customer data for the specific individual and/or may be without any manual or explicit input from the customer. Further, the customer model training module 130 may update the customer model at any time, including after any one or all shopping trips. Thus, after runtime, a new set of transactions may occur. Based on these transactions, the customer model may be updated.

i. Shopping List Sub-Model of Customer Model

One such sub-model of the customer model is a shopping list sub-model, which may be generated by the shopping list training module 222. The shopping list training module 222 may generate a sub-model that includes classifiers or predictors (such as a statistical probability) that the customer will purchase one, some or all product categories offered for sale by the retail establishment. As discussed above, the product categories may be any grouping of the products offered for sale by the retail establishment, such as a product class, a specific product, or an individual product brand. Therefore, for any customer with sufficient transactional data, a classifier or predictor may be generated for one, some, or all of the product categories offered for sale by the retail establishment.

As discussed in more detail below, the training module may use methodologies to analyze the transactional data for the customer. For example, each shopping trip may include certain characteristics, such as the day, date, and time of the shopping trip, and whether the product category was purchased or not purchased. The various shopping trips may be analyzed to derive a function using the methodologies described below, with the inputs to the function being, for example, the day, date, and time of the shopping trip, and the output being the probability that the product category may be purchased. At runtime, such as when the customer enters the store, the function may be accessed to predict the probability that the customer will wish to purchase the product category. Inputs to the function may be the day, date and time when the customer enters the store, and the output may be the probability of purchase by the customer.
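As a minimal sketch of such a function, the Python example below trains on one customer's trips for one product category and returns a function of the trip time. It is an assumption-laden simplification: it uses only the day of the week as input and a smoothed purchase frequency as the estimator, which is a stand-in chosen for brevity and not one of the methodologies described in this disclosure. The names Trip and train_purchase_function are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical trip record: (time of the trip, whether the category was bought).
Trip = Tuple[datetime, bool]


def train_purchase_function(trips: Iterable[Trip]) -> Callable[[datetime], float]:
    """Return f(when) -> estimated purchase probability for one category."""
    bought: Dict[int, int] = defaultdict(int)
    seen: Dict[int, int] = defaultdict(int)
    for when, purchased in trips:
        seen[when.weekday()] += 1
        bought[when.weekday()] += int(purchased)

    def predict(when: datetime) -> float:
        day = when.weekday()
        # Laplace smoothing keeps estimates away from 0 and 1 on sparse data.
        return (bought[day] + 1) / (seen[day] + 2)

    return predict
```

At runtime, the returned function would be called with the time at which the customer enters the store. A day of the week with no recorded trips yields 0.5 under this smoothing, reflecting the absence of evidence either way.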

For example, the shopping list training module 222 may define a set of customers “C”, a set of transactions “T” made by those customers, and a fixed set of product categories “P” acquired by those customers. The product categories “P” may be equivalent to those normally used on shopping lists and may be all of the products (or a subset of the products) available for sale at the retail establishment. Within T and P, each individual customer “c” that is included in the set of customers (each c∈C) has associated with it a set of transactions made by that individual customer “T_(c)” (where T_(c)⊂T) and a set of product categories acquired by that individual customer “P_(c)” (where P_(c)⊂P). For each transaction made by an individual customer c, “t” (where t∈T_(c)), the shopping list training module 222 may define a sub-model for the shopping list. The sub-model may then be used at a later time to predict whether that individual customer c will purchase a particular product category p_(i) (where p_(i)∈P_(c)) by creating a vector of classifiers y∈{0,1}^(|P_(c)|) (the “prediction vector”) where a given classifier y_(i)=1 if, for a given order of all product categories in P_(c), customer c bought p_(i)∈P_(c) in transaction t, and where y_(i)=0 if customer c did not buy p_(i). Therefore, the shopping list prediction module 110 may formulate the classification of product categories for all customers as |P_(c)| binary classifications for each customer, and may derive a separate classifier for each classification.
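As a minimal sketch of the prediction vector described above, the following Python builds one 0/1 vector per transaction for a single customer. The function name `purchase_vectors` and the sample baskets are hypothetical, and sorting the categories alphabetically is only one possible fixed ordering of P_(c).

```python
def purchase_vectors(transactions, categories):
    """Return one 0/1 vector per transaction, ordered like `categories`."""
    return [[1 if p in basket else 0 for p in categories]
            for basket in transactions]

# Hypothetical T_c for one customer: each transaction is a set of categories.
T_c = [{"milk", "bread"}, {"milk"}, {"eggs", "bread"}]

# P_c is every category this customer has acquired, in a fixed order.
P_c = sorted(set().union(*T_c))

# Y[j][i] == 1 when category P_c[i] was bought in transaction j.
Y = purchase_vectors(T_c, P_c)
```

Each column of `Y` then corresponds to one of the |P_(c)| binary classification problems for this customer.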

As discussed above, the shopping list training module 222 may include one or more methodologies for determining the sub-model of the shopping list. For example, one or more methodologies may predict the probability that a customer may purchase any product category. Two types of methodologies include rule-based methodologies and machine learning methodologies. These types of statistical analyses are merely for illustrative purposes. Examples of rule-based methodologies include random rule-based, same-as-last-trip rule-based, and top-n rule-based.

The random rule-based methodology includes random guessing. Using this method, for each transaction of a given customer, attributes may be related to a prediction vector y′ that includes one or more classifiers. Each classifier y′_(i) may be equal to one of two values, such as 0 or 1, with an equal probability. Products to which a classifier, such as y′_(i)=0, is associated are not included in the shopping list. Similarly, products to which a classifier, such as y′_(i)=1, is associated are included in the shopping list. Therefore, as one example, every product class previously purchased by a particular customer has a 50% chance of being included in the shopping list for the next transaction by that particular customer.

The same-as-last-trip rule-based methodology (also referred to as the “same-as-last-trip predictor”) may produce a model for the shopping list that consists of product categories, such as product classes, acquired during a previous transaction. An ordering on the set T_(c) may be imposed for each customer c corresponding to the temporal sequence of each transaction. Then, for each transaction t_(k), a prediction vector y′ is output equal to the purchase vector seen for transaction t_(k-1).
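The two rule-based predictors described so far can be sketched in a few lines of Python. The function names and the sample history are hypothetical, and the purchase vectors are assumed to be ordered from oldest to newest transaction.

```python
import random

def random_prediction(num_categories, rng):
    """Random rule: each classifier is 0 or 1 with equal probability."""
    return [rng.randint(0, 1) for _ in range(num_categories)]

def same_as_last_trip_prediction(purchase_history):
    """Predict the next trip as a copy of the most recent purchase vector."""
    return list(purchase_history[-1])

# Hypothetical purchase vectors for one customer, oldest first.
history = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]

guess = random_prediction(len(history[0]), random.Random(0))
repeat = same_as_last_trip_prediction(history)
```

Both serve mainly as baselines: neither uses any context about the current shopping trip.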

The top-n rule-based methodology may aggregate all the transactions of a particular customer c, and select and include in the shopping list the top n product categories. The rule-based prediction module 212 may rank the product categories according to the quantity of and/or the frequency with which the product categories were acquired.

For example, if the product categories are ranked according to frequency of acquisition, a new ordering on the set P_(c) is defined for a particular customer c, which corresponds to the frequency with which each product category is acquired (“freq(p_(i))”) within T_(c). Specifically, for each product category purchased by the customer (each p_(i)∈P_(c)), the frequency with which a given product category is acquired freq(p_(i)) may be defined by the following equation:

$\begin{matrix}{{\mathrm{freq}\left( p_{i} \right)} = \frac{\sum\limits_{j = 1}^{|T_{c}|}y_{i}^{j}}{|T_{c}|}} & (1)\end{matrix}$

Therefore, in this example, the top-n rule-based methodology produces for each transaction t a vector y′ for which the values corresponding to the top n groupings in P_(c), as ordered by frequency, are equal to 1, and all others are equal to 0. A variation on the top-n rule-based methodology comprises using only the past m transactions to create the top-n list, which may account for some of the temporal changes a customer might exhibit.
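The frequency ranking of equation (1) and the resulting top-n vector might be sketched as follows. The function names and the optional `window` argument (which restricts the count to the most recent transactions, as in the variation noted above) are hypothetical; ties are broken by category order, which is an assumption the text does not address.

```python
def category_frequencies(vectors):
    """freq(p_i): share of transactions in which category i was bought."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def top_n_prediction(vectors, n, window=None):
    """Set to 1 the n most frequently bought categories, 0 elsewhere."""
    recent = vectors[-window:] if window else vectors
    freq = category_frequencies(recent)
    # Stable sort: categories with equal frequency keep their original order.
    ranked = sorted(range(len(freq)), key=lambda i: freq[i], reverse=True)
    keep = set(ranked[:n])
    return [1 if i in keep else 0 for i in range(len(freq))]

# Hypothetical purchase vectors for one customer, oldest first.
Y = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
all_time = top_n_prediction(Y, n=2)
last_trip_only = top_n_prediction(Y, n=2, window=1)
```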

As discussed above, another type of methodology is machine learning. There are several examples of machine learning methodologies, such as decision-tree based and linear based methodologies. The examples of machine learning methodologies are merely for illustrative purposes. Other types of machine learning methodologies are possible. In contrast to rule-based prediction, the machine learning methodology determines the |P_(c)| binary classifications using a machine learning technique, such as supervised learning. To determine the groupings, the machine learning methodology pairs each customer c_(i) with each product category p_(i) (where p_(i)∈P_(c)) to form classes, where each class may be thought of as a customer and product category pair. Therefore, if the available data set includes n customers and q product categories, the machine learning methodology creates n*q classes, and as many binary classifiers (each “y_(i)”).

For each of the binary classifiers y_(i), a classifier may be trained in the supervised learning paradigm to predict whether that category will be bought by that customer in that particular transaction. The following are a series of examples (x, y_(i)), where x is a vector in R^(n) for some n, encoding features of a transaction t, with y_(i)∈{0, 1} representing the label for each example (i.e., whether the category corresponding to y_(i) was bought or not).

As discussed above, there are several machine learning methodologies. One machine learning methodology may use decision trees to predict each class label, such as C4.5 (see J. R. Quinlan, C4.5: Programs For Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, Calif., USA, 1993). Another machine learning methodology may use linear methods to learn each class, such as Perceptron, Winnow, Naive Bayes, Linear Discriminant Analysis, Logistic Regression, and Separating Hyperplanes. These linear methods offer several advantages in a real-world setting, most notably the quick evaluation of generated hypotheses and their ability to be trained in an on-line fashion.
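To illustrate what training in an on-line fashion can look like, here is a minimal Perceptron sketch for a single customer and product category pair. The two-dimensional feature vectors, the learning rate, and the function names are hypothetical and are not taken from the system described here.

```python
def predict(w, b, x):
    """Linear classifier: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def perceptron_update(w, b, x, y, lr=1.0):
    """One on-line update from a single labeled transaction (x, y)."""
    err = y - predict(w, b, x)  # 0 when the prediction was already correct
    return [wi + lr * err * xi for wi, xi in zip(w, x)], b + lr * err

# Hypothetical examples: (feature vector, whether the category was bought).
examples = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]

w, b = [0.0, 0.0], 0.0
for _ in range(5):  # a few passes; each update touches one example only
    for x, y in examples:
        w, b = perceptron_update(w, b, x, y)
```

Because each update uses only the newest example, the weights can be refreshed after every shopping trip without retraining on the full history.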

In each case, a feature extraction step precedes the learning phase.Information about each transaction t is encoded as a vector in R^(n).For each transaction, included are properties of the current visit tothe store and information about the local history before that date interms of data about the previous 4 transactions. An assumption is thatexamples and their labels are not independent, and that one can modelthis dependence implicitly by including information about the previousvisits. This approach is similar to Natural Language Processing fortasks such as part-of-speech tagging, where tags of preceding words areused as features to predict the current tag. The analysis using the 4previous transactions is merely for illustrative purposes. Fewer orgreater transactions may be used.

The features extracted in example (x_(j), y_(i) ^(j)) for a given transaction t^(j) (the “base features”) may include, for example, any combination of the following: the replenishment interval at t^(j); the frequency of interval at t^(j); the range into which the current acquisition falls; the day of the week of the current shopping trip; the time of the day for the current transaction, which may be broken down, for example, into six four-hour blocks; the month of the year for the current transaction; and the quarter of the year for the current transaction.

The replenishment interval at t^(j) may include the number of days at t^(j) since a product category p_(i) was acquired. The frequency of interval at t^(j) may be obtained, for each product category p_(i), by building a frequency histogram for the interval at acquisition binned into several ranges (for example, 3-5 days, 7-9 days), and normalizing the frequency histogram by the total number of times the product category was acquired. The range into which the current acquisition falls may be the same as the ranges indicated for the frequency of interval at t^(j).
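The interval features above might be computed along the following lines. This is a sketch under stated assumptions: the function names, the sample purchase dates, and the treatment of each range as inclusive at both ends are all hypothetical.

```python
from datetime import date

def replenishment_intervals(purchase_dates):
    """Days between consecutive acquisitions of one product category."""
    return [(later - earlier).days
            for earlier, later in zip(purchase_dates, purchase_dates[1:])]

def interval_histogram(intervals, bins, num_acquisitions):
    """Count intervals per (low, high) range, normalized by acquisitions."""
    return {
        (low, high): sum(low <= d <= high for d in intervals) / num_acquisitions
        for low, high in bins
    }

# Hypothetical acquisition dates of one category by one customer.
dates = [date(2004, 1, 1), date(2004, 1, 5), date(2004, 1, 13), date(2004, 1, 17)]

intervals = replenishment_intervals(dates)
hist = interval_histogram(intervals, bins=[(3, 5), (7, 9)],
                          num_acquisitions=len(dates))
```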

For each transaction t, in addition to encoding features of a currenttransaction, traits from prior transactions (the historical transactiondata) may be extracted. These traits from prior transactions may beincluded in (x^(j); y^(j) _(i)). For example, the features of four (4)previous transactions t^(j-1); t^(j-2); t^(j-3); t^(j-4), may beincluded. Additionally, four features may be included with respect toeach transaction in the local history including: (1) whether categoryp_(i) was bought in this transaction; (2) the total amount spent in thistransaction; (3) the total number of items bought in this transaction;and (4) the total discount received in this transaction. These fourfeatures are only used for the local history of the current transactionand not for the current transaction itself. As discussed below withrespect to the runtime module, at runtime these four features are notavailable.
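A sketch of how the four local-history features might be flattened into part of the example vector is shown below. The dictionary keys, the function name, and the zero padding used when fewer than four prior transactions exist are assumptions made for illustration.

```python
def local_history_features(history, category, k=4):
    """Four features for each of the k most recent prior transactions."""
    recent = history[-k:][::-1]  # most recent prior transaction first
    feats = []
    for t in recent:
        feats.extend([
            1.0 if category in t["categories"] else 0.0,  # (1) bought or not
            float(t["total_spent"]),                      # (2) total spent
            float(t["num_items"]),                        # (3) item count
            float(t["discount"]),                         # (4) total discount
        ])
    # Assumption: pad with zeros when the customer has fewer than k trips.
    feats.extend([0.0] * (4 * (k - len(recent))))
    return feats

# Hypothetical prior transactions, oldest first (current trip excluded).
history = [
    {"categories": {"milk"}, "total_spent": 20.0, "num_items": 5, "discount": 1.0},
    {"categories": {"bread"}, "total_spent": 35.5, "num_items": 9, "discount": 0.0},
]
x_history = local_history_features(history, "milk")
```

Only prior transactions are passed in, which matches the point that these four values are not known for the current trip at runtime.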

If the decision tree methodology is used, the features extracted, such as those discussed above, may make up the entire set of features used for training. If the linear methodology is used, it is often difficult to learn a linear separator function using a relatively low-dimensional feature space such as that created by the extracted features. Therefore, in addition to extracting the features discussed above, additional features are created to improve learnability. In addition, basic attributes from the local history may also be combined to increase the number of features for prior transactions. The basic attributes may be combined according to a non-linear transformation.

Creating the additional attributes effectively increases thedimensionality of each example vector x, and thus the chance of learninga linear function that separates the positive and negative examples.This method is similar to those used to learn classifiers in NaturalLanguage Processing contexts where combinations of words such asbi-grams and tri-grams are used as features in addition to the basicwords.

For each numbered feature type above, one may combine it with those ofthe same type in the customer's previous transactions, such as theprevious four transactions (local history). For example, feature 4 (dayof the week for the current transaction) may be combined with feature 4of the previous transaction to produce a new feature. For set-valuedattribute types, such as the day of the week of the current shoppingtrip, Boolean features may be instantiated for each value, for example,in this case, one attribute per day. The combination of these featuresused may be simple Boolean conjunctions. For the feature typescorresponding to continuous valued attributes such as the frequency ofthe interval, a single real valued feature may be created. To createcombinations of these features, one may use a non-linear transformation.In contrast, for the attribute types corresponding to continuous valuedattributes, such as the range into which the current acquisition falls,a single real-valued attribute may be created.

Using the sub-model of the shopping list, a predicted shopping list maybe generated at any point, such as before the customer enters the storeor when the customer enters the store. As discussed in more detailbelow, the current context, such as the day, date, and/or time, may beused with the sub-model of the shopping list to generate statisticalprobabilities that the customer may wish to purchase any productcategory, such as a product class, individual products, or specifictypes of individual products. The statistical probabilities may then beused to output certain product categories to the customer as a predictedshopping list.

ii. Behavior Analysis Attribute of Customer Model

One attribute of the customer model is a behavior analysis attribute(s), which may be generated by the behavior analysis training module 224. The behavior analysis training module 224 may analyze the data from the training transactional database 110 and derive shopping behavior patterns of a particular customer. The shopping behavior patterns may relate to characteristics about a customer based on the product categories that customer acquires. The behavior analysis module 120 may determine characteristics, such as those relating to lifestyle and behavior, by determining a ratio of the product categories acquired by a particular customer to the product categories acquired by all other customers. Depending on the behavior and the data available, this may be done on a product-by-product basis, or may be done on an aggregate set of products. Examples of product categories include LIFESTYLE_ELECTRONICS, PETS_DOGS, PETS_CATS, PETS_OTHER, FAMILY_KIDS, FAMILY_TEEN, RELIGION_JEWISH, RELIGION_MUSLIM, FOOD_ORGANIC, FOOD_NEW_AGE, etc. Products previously purchased may be placed in any one or multiple categories. Two scores may then be calculated for each customer C and each category T to define the behavior of a particular customer. The scores may be based on the amount of money spent or on the number of items purchased. For example, the first score C₁ may be the money spent on products from category T by customer C divided by the total spent by customer C. The second score C₂ may be the average of C₁ over all customers who buy at least one product from category T. A Symmetric Ratio Spend Score may then be derived for customer C, category T as C₁/C₂ if C₁>C₂, or −C₂/C₁ otherwise. As another example, the scores may be based on the number of items purchased in a particular category rather than on the amount of money spent on a category. In particular, C₁ may be the number of products bought from category T by customer C divided by the total number of products bought by customer C. These examples of a customer's behaviors are merely illustrative. Other shopping behaviors may be derived using behavior analysis training module 224.
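The Symmetric Ratio Spend Score can be sketched as follows, assuming each customer's spending is available as a mapping from category to dollars. The function names and sample data are hypothetical, and returning negative infinity for a customer who spends nothing in the category is an assumption, since the ratio is undefined in that case.

```python
def spend_share(spend_by_category, category):
    """C1: share of a customer's total spend that falls in the category."""
    total = sum(spend_by_category.values())
    return spend_by_category.get(category, 0.0) / total if total else 0.0

def symmetric_ratio_spend_score(customer, spend, category):
    c1 = spend_share(spend[customer], category)
    # C2: average C1 over customers who bought at least once in the category.
    buyers = [spend_share(s, category) for s in spend.values()
              if s.get(category, 0.0) > 0]
    c2 = sum(buyers) / len(buyers)
    if c1 == 0.0:
        return float("-inf")  # assumption: no spend at all in the category
    return c1 / c2 if c1 > c2 else -c2 / c1

# Hypothetical dollars spent per category by three customers.
spend = {
    "ann": {"PETS_DOGS": 30.0, "FOOD_ORGANIC": 70.0},
    "bob": {"PETS_DOGS": 10.0, "FOOD_ORGANIC": 90.0},
    "cy": {"FOOD_ORGANIC": 50.0},
}
ann_score = symmetric_ratio_spend_score("ann", spend, "PETS_DOGS")
bob_score = symmetric_ratio_spend_score("bob", spend, "PETS_DOGS")
```

A score above 1 marks a customer who over-indexes on the category relative to other buyers, and a score of −1 or below marks one who does not.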

iii. Brand Loyalty Attribute of Customer Model

Another attribute of the customer model is a brand loyalty, which may begenerated by the brand loyalty training module 226. Customers typicallyhave a propensity to choose a specific brand given the availability ofthat brand for a product category T, as well as across productcategories. The degree of brand loyalty may be subsequently used to moreeffectively offer promotions. As discussed in more detail below, brandloyalty may be used to determine whether it is reasonable to try toinduce a brand switch or whether trying to stretch the brand to otherproduct categories is more appropriate. Brand loyalty may also be usedto offer customer packaged goods companies promotions based on brandusage.

The brand loyalty attribute(s) may comprise a brand loyalty score for every customer, product category, brand, etc. The scores may then be aggregated in any manner. For example, the brand loyalty for all Coca Cola® products (which may include Coca Cola®, Sprite®, Tab®, etc.) may be compiled. The brand loyalty scores may be created by the brand loyalty training module 226 in a variety of ways for every customer-product category pair. For example, brand loyalty may be calculated for a customer-product category as the number of brands bought by the customer in a particular product category divided by the total brands available in the particular category. Alternatively, the brand loyalty may be calculated similar to the previous example, except that the score may be modified based on the popularity of the brand (e.g., brands that are popular receive a lower score and brands that are not very popular receive a higher score). Still another brand loyalty score may derive the premium that is being paid by the customer for the brand that he or she is loyal to. If the customer is loyal to the cheapest brand, the brand loyalty score may be reduced. If the customer is loyal to the most expensive brand, the brand loyalty score may be increased.

iv. Wallet Share Attribute of Customer Model

Another attribute of the customer model is a wallet share, which may be generated by the wallet share training module 228. Customers tend to use different retailers for different categories of goods. The wallet share training module 228 may examine the broad categories for which a customer tends to use a particular retailer, and the proportion of the customer's spending that the retailer is receiving for these categories. For example, the module may determine whether the customer ever uses a particular grocery store for bakery goods, personal hygiene, magazines, toys, electronics, etc., and, if the customer does use the grocery store to purchase products in a broad category, to what extent. As discussed below, the wallet share attribute may be used when determining whether and/or what type of promotions to offer a customer. For example, a promotion may be offered for a product in a category not typically purchased by the customer from this retailer.

v. Price Sensitivity Attribute of Customer Model

Another attribute of the customer model is a price sensitivity, whichmay be generated by the price sensitivity training module 230. The pricesensitivity training module 230 may measure how sensitive a specificcustomer is to prices by binning the customer's purchases at differentlevels. When the data is sparse for a specific product, one mayaggregate to the product category level. In addition, comparisons may bemade across different customers by calculating the percentile price thecustomers typically pay for a particular product. As discussed in moredetail below, knowledge of price sensitivity enables one to restrictpromotions to those who need the additional inducement to trigger apurchase.

There may be different levels of detail in determining price sensitivity, such as at the individual level and at the cluster level. At the individual level, price sensitivities for each customer may be derived with respect to each product. Shrinkage-like techniques may also be used to smooth these estimates. The output of the derivations may comprise a tree of price sensitivities for each customer. The estimates at the leaf nodes may be determined in the following way: given customer C, product P, calculate pairs (R_(i)∈R, P(R_(i))) where R is the set of all unique prices for product P during all of customer C's visits, and

P(R_(i))=(number of times customer C visited the store and bought product P at price R_(i))/(number of times customer C visited the store and the price of product P was R_(i)).

Given pairs (R_(i), P(R_(i))), a least squares fit may be performed to obtain a linear equation relating R_(i) and P(R_(i)). The slope of that line may be the price sensitivity, and the R² value of the fit may be the confidence. These individual price sensitivities may be aggregated and used to calculate price sensitivities at sub-category and category levels.
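A worked sketch of the leaf-node estimate follows: purchase rates per observed price, then an ordinary least squares line through the (price, rate) pairs. The function names and the visit data are hypothetical, and at least two distinct prices are assumed so that the slope is defined.

```python
def purchase_rates(visits):
    """Map each observed price R_i to P(R_i) for one customer and product."""
    seen, bought = {}, {}
    for price, did_buy in visits:
        seen[price] = seen.get(price, 0) + 1
        bought[price] = bought.get(price, 0) + (1 if did_buy else 0)
    return {price: bought[price] / seen[price] for price in seen}

def least_squares_fit(points):
    """Return (slope, intercept, r_squared) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, mean_y - slope * mean_x, r_squared

# Hypothetical visits: (shelf price of product P, whether customer C bought it).
visits = ([(1.0, True)] * 4 + [(2.0, True)] * 2 + [(2.0, False)] * 2
          + [(3.0, False)] * 4)

rates = purchase_rates(visits)
slope, intercept, r_squared = least_squares_fit(sorted(rates.items()))
```

In this example the purchase rate falls by 0.5 for every dollar of price increase, so the slope (the price sensitivity) is negative.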

At the cluster level, (R_(i), P(R_(i))) may define a probability distribution for each customer and product. By clustering customers that have similar price sensitivities for one or more products, one can group them together to create more robust statistics. R_(cpi) is the ith price for product P and customer C.

vi. Promotion Sensitivity Attribute of Customer Model

Another attribute of the customer model is a promotion sensitivity, which may be generated by the promotion sensitivity training module 232. The promotion sensitivity training module 232 may provide a measure of a customer's response to a promotion, such as a sale or coupon offering. The promotion sensitivity training module 232 may determine various aspects of promotion sensitivity, such as hoarding and price efficiency.

The promotion sensitivity training module 232 may assess an individual customer's responses to promotions on a product by product basis. There are various measures of a customer's response to promotions including: (1) hoarding; (2) price efficiency; (3) opportunistic index; (4) coupon index; and (5) sales ratio. These measures are merely for illustrative purposes, and other measures of a customer's response to promotions are available.

With regard to hoarding, the promotion sensitivity training module 232 may determine whether a customer hoards product categories during a sale, and if the customer hoards, whether the hoarding is of a type that is to be encouraged. Hoarding, or acquiring a greater number of a particular product category during a sale, is a common customer behavior. In some cases, a customer will purchase more of a particular product category during a sale than they would normally, but fewer after the sale. If the total amount spent on the particular product category during the sale and after the sale is greater than it would have been over the same time period if the sale had not occurred, the customer is considered a “good hoarder.” However, if the total amount spent on the particular product category during the sale and after the sale is less than it would have been over the same time period if the sale had not occurred, the customer is considered a “bad hoarder.” The promotion sensitivity training module 232 treats a “neutral hoarder,” which includes customers that do not change their acquisition behavior as a result of a sale, as a subcategory of bad hoarders because such customers benefit from the sale even though they are not sensitive to it.

To determine whether a customer is a “good hoarder” or a “bad hoarder” with respect to a particular product category, the promotion sensitivity training module 232 may examine a customer's acquisition behavior with respect to a particular product category during three time periods: pre-sale, sale, and post-sale. Generally, it can be assumed that the duration of the sale period is the same for all customers. However, the pre-sale and post-sale periods may differ among the customers because they are based on the individual replenishment rates of each customer for that particular product category. Therefore, if a customer acquires a product category less often, the replenishment rate is lower, and thus the pre-sale and post-sale periods are made longer. The pre-sale and post-sale periods may or may not be equal in duration.

The promotion sensitivity training module 232 may determine whether a customer is a good or bad hoarder by examining transaction data over a predetermined time period, such as three (3) months, or a predetermined number of replenishment cycles, such as 6. If a new sale occurs during the post-sale or pre-sale period, the promotion sensitivity training module 232 may shorten the post-sale or pre-sale period, respectively.

One way to identify bad hoarders is to compare the total revenue for the sale and the post-sale period with that of the pre-sale period. This method is useful for identifying total revenue lost. Alternatively, one may ignore the sale subject to the promotion, and merely focus on customers that spend less after the sale than they did before as a measure to identify bad hoarders. The latter method, while less conservative, gives a measure of whether customers increased their consumption of a product or simply stored it at home for a rainy day. If loading up, as opposed to increased consumption, occurred, the post-sale period can be further divided into the reserve period and the resumed consumption period. Those periods may be identified by comparing replenishment rates with the rule-based replenishment rate.

As discussed subsequently, determining hoarding behavior is beneficial in determining whether to provide a promotion and/or the type of promotion. By identifying bad hoarders as those who spend less during the sum of the sale period and the post-sale period than they did during the pre-sale period, one may calculate the amount of revenue lost for the store, and determine whether this detriment outweighs the benefits of providing the promotion. One may also predict future behavior of customers by looking at the previous sales data and advise the store management which customers should be receiving promotions.
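One way to sketch the good/bad hoarder test is to compare spending during the sale and post-sale periods with a no-sale baseline. Here the baseline is the pre-sale daily spend projected over the same number of days, which is an assumption made so that periods of different length can be compared; the function name and figures are hypothetical.

```python
def classify_hoarder(pre_spend, pre_days, sale_spend, sale_days,
                     post_spend, post_days):
    """Label a customer for one product category around one sale."""
    # Assumption: without the sale, spending would continue at the pre-sale rate.
    baseline = pre_spend / pre_days * (sale_days + post_days)
    actual = sale_spend + post_spend
    # Neutral customers (no change) are grouped with bad hoarders.
    return "good hoarder" if actual > baseline else "bad hoarder"

def revenue_lost(pre_spend, pre_days, sale_spend, sale_days,
                 post_spend, post_days):
    """Shortfall against the same baseline; zero when spending rose."""
    baseline = pre_spend / pre_days * (sale_days + post_days)
    return max(0.0, baseline - (sale_spend + post_spend))

# Hypothetical customer: 14-day pre-sale, 7-day sale, 14-day post-sale.
label_up = classify_hoarder(14.0, 14, 25.0, 7, 10.0, 14)
label_flat = classify_hoarder(14.0, 14, 7.0, 7, 14.0, 14)
lost = revenue_lost(14.0, 14, 10.0, 7, 5.0, 14)
```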

The promotion sensitivity training module 232 may provide a measure of promotion sensitivity in terms of a sensitivity index, and/or one or more price efficiency indices. The sensitivity index represents the percentage of change in the quantity of a particular product category acquired during a sale over or under that acquired when there is no sale. The promotion sensitivity training module 232 may determine the sensitivity index for individual customers as well as individual products and product categories.

With regard to price efficiency, the promotion sensitivity training module 232 may also provide a comparison of the sale behavior of an individual customer to that of other customers (the “price efficiency indices”). The price efficiency indices may include an opportunistic index and/or a coupon index. These indices may provide a measure of how savvy a customer is by examining how much that customer actually pays for particular product categories.

The opportunistic index may measure the average difference between the price the customer paid and the most common price of a product category. The common price of a product category may be determined over any time period, such as the most frequently occurring daily price over the last 2 years. The opportunistic index includes the effects of promotions and permanent price changes. From the point of view of the customer, a negative opportunistic index is good. For example, a customer who shops more often during sales will have a highly negative opportunistic index. However, the customer will get points for getting a lower price than the mode even if the product is not on sale. This will include permanent price drops, coupons, etc.

The coupon index may measure the difference between the price the customer paid and the price paid by most people the day of the purchase (mode of the day). This provides a measure of how much of an individual price a customer receives. More than just a measure of sales, this provides a measure of whether the customer prefers to beat the price that others are paying that day. As discussed subsequently, this measure may be useful for analyzing individual promotions such as coupons (unless they are very popular coupons that most people use during a day). From the point of view of the customer, a negative value is typically good.
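Both price efficiency indices can be sketched with the standard library's `statistics` module. The function names and prices are hypothetical, and averaging the coupon index across several purchases is an assumption, since the text defines it for a single purchase.

```python
from statistics import mean, mode

def opportunistic_index(prices_paid, daily_prices):
    """Average of (price paid - most common daily price over the period)."""
    common_price = mode(daily_prices)
    return mean(paid - common_price for paid in prices_paid)

def coupon_index(purchases):
    """Average of (price paid - modal price paid by all shoppers that day)."""
    return mean(paid - mode(day_prices) for paid, day_prices in purchases)

# Hypothetical daily shelf prices and the prices this customer paid.
daily_prices = [3.0, 3.0, 2.5, 3.0]
opp = opportunistic_index([2.5, 3.0], daily_prices)

# Hypothetical purchases: (price paid, prices everyone paid that day).
coup = coupon_index([(2.0, [2.5, 2.5, 2.0]), (2.5, [2.5, 2.5, 2.0])])
```

Negative values in either index mean the customer tends to pay less than the reference price.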

The promotion sensitivity training module 232 may also determine a sales ratio. The sales ratio is the ratio of the number of items in a product category acquired during a sale to the total number of products purchased. It is useful for analyzing the effects of advertised sales. A positive sales ratio indicates an effective sale.
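The sensitivity index and the sales ratio reduce to simple arithmetic, sketched below. The function names and inputs are hypothetical, and expressing the quantities for the sensitivity index per shopping trip is an assumption made so that sale and non-sale periods of different lengths are comparable.

```python
def sensitivity_index(sale_qty_per_trip, regular_qty_per_trip):
    """Percent change in quantity acquired on sale versus off sale."""
    change = sale_qty_per_trip - regular_qty_per_trip
    return 100.0 * change / regular_qty_per_trip

def sales_ratio(items_bought_on_sale, total_items_bought):
    """Share of all purchased items that were acquired during a sale."""
    return items_bought_on_sale / total_items_bought

# Hypothetical customer: 3 units per trip on sale versus 2 units normally.
index = sensitivity_index(3.0, 2.0)
ratio = sales_ratio(3, 12)
```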

vii. Product Substitution Attribute of Customer Model

Another attribute of the customer model is a product substitution, whichmay be generated by the product substitution training module 234. Theproduct substitution training module 234 may identify a product orproducts that are substitutes for one or more products on a shoppinglist. Alternately, or in addition, the product substitution trainingmodule 234 may identify product categories on the shopping list that aresubstitutes for each other. As discussed below, this information may beused at runtime, wherein the shopping list runtime module may remove oneor more of the substitute product categories from the shopping list.

The product substitution module 324 may determine product categorysubstitutions at multiple levels, such as store-level substitutes andcustomer-level substitutes. For the same product category, thesubstitutes may not be the same at the store and customer levels. Forexample, Coke® and Diet Coke® may be substitutes for one another at thestore level, but for a particular customer, Coke® and Diet Coke® may notbe substitutes.

The product category substitution module 324 may determine substitutes according to the following:

For items i and j, calculate P(i), the probability of buying item i, P(j), the probability of buying item j, and P(i, j), the probability of buying both items i and j.

C(i, j)=0 if i and j are in different categories, and 1 if they are in the same category. If P(i, j)<P(i)*P(j) and C(i, j)=1, then i and j may be considered substitutes. Therefore, substitutes may be determined using a score for each item pair, i and j, that measures the degree to which they can be substituted for each other. This score can be calculated in a variety of ways. As discussed above, the score is 0 if the items i and j are not in the same category and equals P(i, j)/(P(i)*P(j)) if the items are in the same category. If i and j are in the same category and always bought together, then the items are not substitutes. If the items are in the same category and are rarely (or never) bought together, then they may receive a high substitution score.
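The substitute test and the ratio P(i, j)/(P(i)*P(j)) might be computed from basket data as sketched below. The function names and baskets are hypothetical, and both items are assumed to appear in at least one basket so that the ratio is defined. Note that, as the ratio is written, item pairs meeting the condition P(i, j)<P(i)*P(j) have a ratio below 1.

```python
def prob(baskets, *items):
    """Share of baskets that contain every one of the given items."""
    return sum(all(item in b for item in items) for b in baskets) / len(baskets)

def substitution(baskets, i, j, same_category):
    """Return (ratio, is_substitute) for an item pair."""
    if not same_category:  # C(i, j) = 0
        return 0.0, False
    p_i, p_j, p_ij = prob(baskets, i), prob(baskets, j), prob(baskets, i, j)
    return p_ij / (p_i * p_j), p_ij < p_i * p_j

# Hypothetical baskets for one customer (or one store).
baskets = [{"coke"}, {"diet coke"}, {"coke", "chips"}, {"diet coke"}]

never_together = substitution(baskets, "coke", "diet coke", same_category=True)
other_category = substitution(baskets, "coke", "chips", same_category=False)
always_together = substitution([{"a", "b"}, {"a", "b"}, {"c"}, {"c"}],
                               "a", "b", same_category=True)
```

Running the same function over one customer's baskets or over all baskets in a store gives the customer-level and store-level views, respectively.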

viii. Basket Variability Attribute of Customer Model

Another attribute of the customer model is a basket variability, which may be generated by the basket variability training module 236. Basket variability measures the variance of a particular customer's total spending from one shopping visit to the next. In other words, basket variability is an indicator of how much a given customer's total spending during a visit tends to vary from visit to visit. If the customer has a high variability (i.e., the variance is significantly greater than average so that the customer does not have a set amount of spending from one visit to the next), one may offer promotions intended to grow basket size. The basket variability may be determined in a variety of ways. For example, basket variability may be determined as the distance of the customer's basket distribution from a uniform distribution using a mean-squared error distance or Kullback-Leibler divergence. In particular, the basket variability training module 236 may determine, for different values of X and Y, the percentage of times that X% of the customer's shopping baskets (in terms of total spent) were within Y dollars of each other. Any values of X and Y may be selected.
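Two simple variability measures are sketched below: the variance of basket totals, and the share of basket pairs whose totals fall within Y dollars of each other. The second is only one possible reading of the X and Y description above, and the function names and totals are hypothetical.

```python
from itertools import combinations
from statistics import pvariance

def basket_variance(totals):
    """Population variance of total spend per visit."""
    return pvariance(totals)

def share_within(totals, y_dollars):
    """Share of basket pairs whose totals differ by at most y_dollars."""
    pairs = list(combinations(totals, 2))
    return sum(abs(a - b) <= y_dollars for a, b in pairs) / len(pairs)

# Hypothetical total spend on four visits by one customer.
totals = [50.0, 52.0, 48.0, 90.0]

variance = basket_variance(totals)
close_share = share_within(totals, y_dollars=5.0)
```

Comparing `variance` against the average across customers would indicate whether this customer counts as high or low variability.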

As discussed in more detail below, a promotion may be offered to growthe basket size (e.g., 3 for the price of 2, etc.) if the customer'sbasket size varies. If the customer has a low variability (i.e., thevariance is significantly lower than average so that the customer spendsa set amount from one visit to the next), one may offer promotionsintended to maximize margin. As discussed in more detail below, acustomer may be offered a promotion for a product which is a highermargin for the retail establishment (e.g., 10% off a high margin brand).

ix. Shopping Trip Frequency Attribute of Customer Model

Another attribute of the customer model is a shopping trip frequency,which may be generated by the shopping trip frequency training module238. Shopping trip frequency relates to the frequency or timing ofshopping trips. For example, the data relating to timing of previousshopping purchases in the training transactional database 110 may beanalyzed to derive an attribute relating to the frequency of shoppingtrips. Specifically, the dates of the last “x” number of shopping tripsmay be analyzed to determine an average time between the shopping trips,a particular shopping day of the week (such as Sunday) and/or theparticular shopping time of the day (such as in the morning). Asdiscussed subsequently, these shopping trip frequency attributes may beused to determine whether and what type of promotions should be offeredto a particular customer.
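The trip frequency attributes might be derived from timestamps as in the sketch below. The function name, the trip times, and the split of the day at noon are hypothetical; at least two trips are assumed so that an average gap exists.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def trip_frequency(trip_times, last_x):
    """Average days between trips, usual weekday, and usual part of day."""
    recent = sorted(trip_times)[-last_x:]
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(recent, recent[1:])]
    usual_day = Counter(DAYS[t.weekday()] for t in recent).most_common(1)[0][0]
    # Assumption: a coarse two-way split of the day at noon.
    parts = Counter("morning" if t.hour < 12 else "afternoon/evening"
                    for t in recent)
    return sum(gaps) / len(gaps), usual_day, parts.most_common(1)[0][0]

# Hypothetical trip timestamps for one customer.
trips = [datetime(2004, 1, 4, 10), datetime(2004, 1, 11, 10),
         datetime(2004, 1, 18, 10), datetime(2004, 1, 25, 10)]

avg_gap_days, usual_day, usual_part = trip_frequency(trips, last_x=4)
```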

B. Runtime System Using Customer Model

As discussed above, the customer models may be used to improve anyaspect of a retail establishment's operations. The customer models maybe generated, and when needed, accessed at any time using a runtimesystem. In the context of a grocery store, the customer model may beaccessed before, during, and after a customer shops at the grocerystore. Moreover, one, some, or all of attributes of the customer modelmay be accessed during runtime. For example, the shopping list sub-modelof the customer model may be accessed when the customer arrives at thegrocery store, generating a predicted list of items for the customer.This predicted list may then be used by other attributes of the customermodel in order to offer promotions to the customer. Alternatively, onlya predicted shopping list may be generated without any promotionsrelated to any of the predicted items on the list. Or, promotions may begenerated for items not related to a predicted shopping list.

FIG. 3 is a block diagram of one example of a runtime system 300 using the customer models wherein a shopping list is predicted (using shopping list prediction runtime module 320) and wherein promotions are provided related to items on the predicted shopping list (using individualized application runtime module 330). The runtime system 300 may further include a customer interface system 340. The runtime system 300 may use a runtime transactional database 310 and the customer models database 140, shown in FIG. 1. The runtime transactional database 310 may include data similar to that in the training transactional database 110, but may be updated with additional data, such as the current context.

The shopping list prediction runtime module 320 may access a specific customer model in the customer models database 140. Each customer model may be used to predict a shopping list for a given customer on a given shopping trip (“transaction”) and to provide other individualized applications to the customer. As discussed above, the runtime system 300 is applicable in a variety of circumstances in which customers seek to acquire goods and/or services (collectively or individually a “product” or “products”).

Similar to the training system 100, any or all of the runtime system 300, including the runtime transactional database 310, shopping list prediction runtime module 320, and individualized application runtime module 330, may be implemented on one or more computers. The computer may include one or more processors. The processor may include any type of device or devices used to process digital information. The runtime transactional database 310, shopping list prediction runtime module 320, individualized application runtime module 330, and/or portions of the foregoing may also include one or more computer-readable media, as described below.

Within the computer system, the shopping list prediction runtime module 320 and the individualized application runtime module 330 may be implemented in a computer-readable medium, or an electromagnetic signal that carries logic that defines computer-executable instructions for performing the functions of the shopping list prediction runtime module 320 and the individualized application runtime module 330.

Using the customer model, the shopping list prediction runtime module 320 generally predicts a list of product categories (which may include goods and/or services) that a customer will want or need to acquire on a given shopping trip (a “shopping list”). The product categories may include a grouping of one or more product classes, individual products, or specific types of individual products (including goods and/or services). The shopping list prediction runtime module 320 may frame the process of predicting a shopping list as a classification problem. Generally speaking, classifications may be determined by constructing a procedure (a “classification rule”) to apply to a continuing sequence of cases, in which each new case is assigned to one of a set of pre-defined groups. By framing the shopping list prediction issue in this manner, the issue becomes the construction of a classification rule that, when applied to a particular customer on a particular shopping trip (“transaction”), assigns a particular product category to one of two groups. The first group includes product categories that are to be acquired by that customer (the “acquire group”), and the second group includes product categories that are not to be acquired by that customer (the “do not acquire group”).

As discussed above, a classification rule may be implemented for a particular customer by assigning a classifier to each product category that a particular customer may acquire. Thus, a classifier may be included for each product category for each customer with sufficient data. At runtime, the current context, such as the day, date, and time, may be input to the classifier. The classifier may output a probability that the customer will purchase the product category in this shopping trip. Depending on the probability, the product category may be in the acquire group or the do not acquire group. If the classification is repeated for each product category a particular customer is likely to obtain, a shopping list may be constructed from the product categories in the acquire group.

FIG. 4 shows an expanded block diagram of the shopping list prediction runtime module 320. As discussed above, the shopping list training module 232 may use a plurality of methodologies to generate the attributes of a shopping list in the customer model. Examples of the various methodologies include rule-based prediction and machine learning. At runtime, the attributes from the various methodologies may be accessed, using the attribute extraction module 410, and updated based on the runtime current context to generate a shopping list based on the methodology. For example, accessing the customer model using a sub-model generated by a machine learning method, the machine learning runtime module 414 may update the sub-model for the runtime context, including: the replenishment interval at t^(j); the frequency of interval at t^(j); the range into which the current acquisition falls; the day of the week of the current shopping trip; the time of the day for the current transaction; and the quarter of the year for the current transaction. The machine learning runtime module 414 may then generate a predicted shopping list for each product category (such as all of the individual products offered by the store). Similarly, accessing the customer model using a sub-model generated by a rule-based prediction method, the rule-based prediction runtime module 416 may update the sub-model for the runtime context, and generate a predicted shopping list for each product category.

The hybrid prediction runtime module 412 may provide a hybrid approach, between machine learning and rule-based methods, to generating a shopping list. The hybrid prediction runtime module 412 may select one of the prediction methods or a combination of the prediction methods in order to formulate a probability that a particular product category may be purchased by the customer on this shopping trip. The hybrid prediction runtime module 412 may treat each class as independent of the others for a given transaction. Therefore, the hybrid prediction runtime module 412 may use different classification methods for different classes. For example, the hybrid prediction runtime module 412 may combine a top-n rule-based classifier created by the rule-based prediction runtime module 416 with various classifiers created by the machine learning runtime module 414. In that case, if the rule-based prediction module 212 (for example, using the top-n predictor for a given n) is positive for a given class, the hybrid prediction runtime module 412 will predict that the product category included in the class will need to be acquired; otherwise, the hybrid prediction runtime module 412 will predict according to the output of the machine learning runtime module 414. Thus, the hybrid prediction runtime module 412 may determine a probability, based on a single or multiple prediction approaches, for one, some, or all of the product categories.
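The combination rule in the example above (a positive top-n rule wins, otherwise the machine learning output decides) could be sketched as below. This is a minimal illustration, not the described implementation: the function name, the dictionary of machine learning probabilities, and the 0.5 cut-off are all assumptions.

```python
def hybrid_predict(category, top_n_categories, ml_probability, cutoff=0.5):
    """Return True if the category is predicted to be acquired.

    top_n_categories: categories flagged positive by a top-n rule (assumed input).
    ml_probability:   per-category probabilities from a learned classifier
                      (assumed input).
    cutoff:           hypothetical decision threshold for the learned classifier.
    """
    # A positive rule-based prediction takes precedence.
    if category in top_n_categories:
        return True
    # Otherwise fall back to the machine learning output.
    return ml_probability.get(category, 0.0) >= cutoff

top_n = {"milk", "bread"}
ml_probs = {"milk": 0.2, "yogurt": 0.8, "eggs": 0.3}
predictions = {c: hybrid_predict(c, top_n, ml_probs)
               for c in ["milk", "bread", "yogurt", "eggs"]}
```

Here milk is predicted even though its learned probability is low, because the rule-based predictor is positive for it.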

After the probabilities that the product categories may be purchased on this shopping trip are determined, the hybrid prediction runtime module 412 may analyze the probabilities in order to compile a predicted shopping list. The analysis may use a pre-determined probability score, below which the product category is not included on the shopping list. For example, if the pre-determined probability score is 0.7 (on a scale of 0 to 1.0), all product categories with a probability of purchase on this shopping trip of 0.7 or above are included on the shopping list. Alternatively, the pre-determined probability score may be specific to each product category. For example, yogurt purchases may have a pre-determined probability score of 0.8 whereas milk purchases may have a pre-determined probability score of 0.9. Moreover, after the predicted list is compiled, the number of items on the list may dictate readjustment of the pre-determined probability score(s). For example, if the predicted shopping list results in only 2 items on the list, the pre-determined probability score(s) may be lowered so that more items may be placed on the list. Conversely, if the predicted shopping list results in over 50 items on the list, the pre-determined probability score(s) may be raised so that fewer items may be placed on the list.
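One way to picture the thresholding and readjustment just described is the sketch below. The function name, the step size, the minimum and maximum list lengths, and the cap on adjustment rounds are assumptions chosen for illustration; only the 0.7 default and the idea of per-category scores come from the text.

```python
def compile_shopping_list(probs, threshold=0.7, per_category=None,
                          min_items=3, max_items=50, step=0.05, max_rounds=20):
    """Keep categories whose probability meets the score, then readjust.

    per_category optionally overrides the score for specific categories.
    min_items, max_items, step, and max_rounds are hypothetical tuning values.
    """
    per_category = per_category or {}
    offset = 0.0  # shift applied to every score during readjustment
    items = []
    for _ in range(max_rounds):
        items = [c for c, p in probs.items()
                 if p >= per_category.get(c, threshold) + offset]
        if len(items) < min_items:
            offset -= step   # too few items: lower the score(s)
        elif len(items) > max_items:
            offset += step   # too many items: raise the score(s)
        else:
            break
    return sorted(items)

probs = {"milk": 0.95, "yogurt": 0.78, "bread": 0.66, "eggs": 0.40}
shopping_list = compile_shopping_list(
    probs, per_category={"milk": 0.9, "yogurt": 0.8})
```

With the sample probabilities, the first pass keeps only milk, so the scores are lowered once and yogurt and bread are added.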

Once the predicted shopping list is generated, the list may be sent to the individualized application runtime module 330. The individualized application runtime module 330 includes one or more applications that provide individualized interactions with a particular customer that are customized for that particular customer (“individualized applications”).

The individualized application runtime module 330 may receive input from any module which provides an item or items of potential interest to a customer. As discussed above, the shopping list prediction runtime module 320 is one example of a module which may provide an item or a list of items. Other modules may likewise provide an item as input to the individualized application runtime module 330. For example, a module which senses the location of the customer in the store, such as in the dairy aisle, may input any dairy type of product to the individualized application runtime module 330.

An example of the individualized application runtime module 330 is shown in FIG. 5. The individualized application runtime module 330 may include a promotion generation module 510 and one or more other modules 512, 514, 516, 518, 520, 522, 524, 526 that may access a part of the customer model and, based on the goals of the retail establishment, provide input to the promotion generation module 510. The promotions may be provided at any point, such as when the customer enters the store, in the midst of shopping in the store, or at checkout.

Various goals of the retail establishment may be implemented using modules 512, 514, 516, 518, 520, 522, 524, 526, as discussed below. The goals may include increasing sales, increasing profit, providing information, providing a promotion for a third party, etc. For example, the behavior analysis runtime module 512 accesses the behavior analysis attribute of the customer model. As discussed above, the behavior analysis attribute reflects shopping behavior patterns of a particular customer. The behavior analysis runtime module 512 may offer promotions consistent with the shopping behavior patterns of the customer. Depending on the behavior and the data available, the promotions may be performed on a product-by-product basis, or may be done on an aggregate set of products. For example, if the behavior analysis attribute of a particular customer reflects an emphasis on purchasing organic foods, promotions may be tailored to highlight various organic foods in the grocery store. In particular, the organic food in the grocery store is often dispersed throughout the store. A customer may be alerted to organic food he or she may otherwise be unaware of when the customer travels down the aisles of the grocery store. As another example, if the behavior analysis attribute of a particular customer reflects an emphasis on purchasing Dove® soap, the customer may be notified when related items, such as Dove® bodywash, are introduced or are on sale.

As another example, the brand loyalty runtime module 516 may access the brand loyalty attribute of the customer model. As discussed above, the degree of brand loyalty may be used to more effectively offer promotions. Knowledge of the degree of a customer's tendency to buy one brand of a product over others in a product category, as well as the degree of the tendency to buy a given brand when available in any product category, enables the selection of specific promotion tactics by the brand loyalty runtime module 516. For example, for very high brand loyalty, it may be appropriate to do “brand extensions” (that is, to introduce new products of the same brand). For medium brand loyalty, it may be appropriate to attempt to either raise loyalty with discounts on the existing brand, or to attempt to switch the customer to a new brand. For low levels of loyalty, it may be appropriate to offer promotions intended to achieve short term revenue gains.

For example, if a particular customer exhibits a low degree of loyalty for purchases of yogurt (e.g., the customer does not have a favorite brand of yogurt), the brand loyalty runtime module 516 may craft a promotion to induce the customer to try a specific brand of yogurt. Alternatively, if a customer exhibits a high degree of loyalty for a certain brand, such as Tostitos® chips, it may be possible to extend the brand to other product categories, such as Tostitos® salsa. Brand loyalty may also be used to offer consumer packaged goods companies promotions based on brand usage.

The wallet share runtime module 518 may access the wallet share attribute of the customer model. Typically, grocery stores offer more than packaged food items; larger grocery stores also offer other items such as bakery goods, personal hygiene products, magazines, toys, electronics, etc. The wallet share attribute may indicate the broad categories for which a customer tends to use a particular retailer, and the proportion of the customer's spending that the retailer is receiving for these categories, as discussed above. Knowledge of a particular customer's spending enables the grocery store to target promotions for categories where the particular customer is not purchasing (or is purchasing less in proportion to other categories). For example, if a particular customer typically does not purchase magazines from a grocery store, the wallet share runtime module 518 may provide a promotion to purchase a magazine. Further, the timing in which the promotion is given to the customer may depend on the customer's location in the store. For example, if the magazines are located near the checkout line, the customer may receive a promotion when waiting to check out, as discussed in more detail below.

The price sensitivity runtime module 520 may access the price sensitivity attribute of the customer model. The price sensitivity attribute may measure how sensitive a specific customer is to prices, as discussed above. Using the price sensitivity attribute, the price sensitivity runtime module 520 may determine whether to offer a promotion to a customer and, if a promotion is offered, the parameters of the promotion to offer the customer. If a customer is price sensitive for a product, such as milk, the price sensitivity runtime module 520 may determine that a promotion may be a sufficient incentive to try a new or a different brand of milk. Moreover, if a variety of promotions may be offered to the customer (e.g., 10%, 20%, 30% off; or 2 for 1, 3 for 2, etc.), the price sensitivity runtime module 520 may provide the amount of discount for the promotion given the past history of the customer and the price sensitivity of the customer, as indicated by the price sensitivity attribute.

The promotion sensitivity runtime module 522 may access the promotion sensitivity attribute of the customer model to determine whether, and/or what type of, promotion to offer a particular customer. Referring to FIG. 6, there is shown an expanded block diagram of the promotion sensitivity runtime module 522. As discussed above, there are various measures of a customer's response to promotions including: (1) hoarding; (2) price efficiency; (3) opportunistic index; (4) coupon index; and (5) sales ratio. The promotion sensitivity runtime module 522 may include a hoarding module 602, a price efficiency module 604, an opportunistic index module 606, a coupon index module 608 and a sales ratio module 610.

As discussed above, the hoarding attribute may provide a measure of whether a shopper is a “good hoarder” or a “bad hoarder.” Based on this information, the hoarding module 602 may determine whether to provide a promotion to a particular customer, and if so, what type of promotion. The ability to note the degree to which a particular customer “pantry loads” or “hoards” a particular product allows the hoarding module 602 to determine whether to offer the customer a promotion intended to boost overall consumption of the product, to withhold any promotion, or to provide a promotion to generate short term revenues.

For example, if the customer is a bad hoarder, the hoarding module 602 may determine not to provide a promotion to the customer. Or, given that the customer has previously exhibited bad hoarding characteristics, the hoarding module 602 may attempt to provide different types of promotions in an attempt to elicit different behavior from the customer. For example, if the customer has previously hoarded for promotions relating to discounts such as a percentage reduction or a fixed amount off (such as 10% or $0.50 off of spaghetti), the hoarding module 602 may determine that a different type of promotion (such as 10% or $0.50 off of spaghetti and sauce in combination) is warranted. If the customer is a good hoarder, the hoarding module 602 may determine to provide a promotion to the customer, and the type of promotion based on the customer's reaction to previous promotions.

The price efficiency attribute may provide an indicator of the sale behavior of a customer relative to other customers, in particular whether the customer is a “savvy” shopper with regard to how much he or she paid for various items. Based on this information, the price efficiency module 604 may determine whether to provide a promotion to a particular customer, and if so, what type of promotion. The amount of discount for an item may be increased for a customer with a higher price efficiency index than another customer with a lower price efficiency index.

The opportunistic index attribute may provide a measure of the frequency with which the customer purchases items on sale. Based on this information, the opportunistic index module 606 may determine whether to provide a promotion to a particular customer, and if so, what type of promotion. The opportunistic index may be applied for various products in the store, with some products having a negative opportunistic index (typically purchased on sale) and other products having a positive opportunistic index (typically not purchased on sale). For those products with a negative opportunistic index, the opportunistic index module 606 may craft a promotion with a reduction in price. Similarly, for those products with a positive opportunistic index, the opportunistic index module 606 may craft a promotion with an advertisement detailing the benefits of the product without a reduction in price.
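The sign convention above maps directly to a choice of promotion type, as the following sketch suggests. The function name, the returned labels, and the treatment of an index of exactly zero are assumptions made for illustration.

```python
def opportunistic_promotion(index):
    """Pick a promotion type from a product's opportunistic index.

    Negative index: product is typically purchased on sale.
    Positive index: product is typically not purchased on sale.
    """
    if index < 0:
        return "price_reduction"
    if index > 0:
        return "advertisement"   # benefits of the product, no price cut
    return "none"                # zero is not covered by the text; assumed here

choices = {product: opportunistic_promotion(idx)
           for product, idx in {"cereal": -0.4, "coffee": 0.6}.items()}
```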

The coupon index attribute may provide a measure of how much an individual price a customer receives. Based on this information, the coupon index module 608 may determine whether to provide a promotion to a particular customer, and if so, what type of promotion. For example, a negative coupon index indicates that the customer may respond well to individualized promotions. Therefore, the coupon index module 608 may craft several promotions for the items listed in the customer's shopping list.

The sales ratio attribute may provide a measure of the ratio of the number of products bought during sales to the total number of products bought. Based on this information, the sales ratio module 610 may determine whether to provide a promotion to a particular customer, and if so, what type of promotion. If a customer has a low sales ratio attribute, this indicates that the customer may be averse to being provided many promotions. Therefore, the sales ratio module 610 may determine that fewer promotions for percentage reductions on items should be provided to the customer, and other promotions, such as advertisements or suggestions for recipes, etc., may be more beneficial. Conversely, if a customer has a higher sales ratio attribute, the sales ratio module 610 may determine that a greater number of promotions for percentage reductions is warranted.

The product category substitution runtime module 524 may access the product category substitution attribute of the customer model to determine whether, and/or what type of, promotion to offer a particular customer. The product category substitution attribute may identify a product category (such as an individual product) that may be substituted for one or more product categories (such as one or more products) on a shopping list. Given a shopping list, the product category substitution runtime module 524 may review the list for any potential substitutions in product categories. If items are in the same category and are rarely (or never) bought together, then one item may be recommended as a substitute for another item. For example, one product on the predicted shopping list may comprise Dannon® strawberry yogurt. A potential substitute product may comprise the store-brand strawberry yogurt.

The basket variability runtime module 526 may access the basket variability attribute of the customer model to determine whether, and/or what type of, promotion to offer a particular customer. The basket variability attribute is an indicator of how much a given customer's total spending during a visit tends to vary from visit to visit. Awareness of a customer's basket variance can be used to choose between margin maximization tactics (e.g., discounts on high margin products) or revenue maximization tactics (e.g., discounts on larger pack sizes). In particular, using the basket variance, the basket variability runtime module 526 may offer promotions designed to grow basket size (such as 3 products for the price of 2), or promotions designed to maximize margin (such as 10% off a high margin brand). For example, if the customer has a high variability (i.e., the variance is significantly greater than average so that the customer does not have a set amount of spending from one visit to the next), the basket variability runtime module 526 may offer promotions intended to grow basket size. For example, the basket variability runtime module 526 may determine that a “buy 2 get 20% off” promotion is more appropriate than a “buy 1 get 10% off” promotion since the customer does not have a fixed basket size. Alternatively, if the customer has a low variability, the basket size may be more constant. The basket variability runtime module 526 may then attempt to move the fixed basket customer to a higher margin product, such as by offering a promotion to purchase a store brand.

The individualized application runtime module 330 may also include an anonymous profiling runtime module 514. When a customer, for privacy or other reasons, chooses not to be identified, the individualized application runtime module 330 may provide the anonymous customer a limited level of individualized interaction. The anonymous profiling runtime module 514 may use the current transactional data to create an incremental profile, and provide some information to the anonymous customer. For example, the anonymous profiling runtime module 514 may use the product categories acquired during a transaction. The anonymous profiling runtime module 514 may receive such information from a product identification system, such as a scanner, to identify products selected by the anonymous customer in the course of the anonymous customer's current transaction. As the customer shops, a profile is built on the fly as each additional item is scanned. As the profile grows, it can be matched to existing, more detailed profiles from which detailed predictions can be made. While this will be less accurate than relying on profiles built over time from known customers, it is enough to provide some of the same benefits.

The promotion generation module 510 may be in communication with any one of the modules 512, 514, 516, 518, 520, 522, 524, 526 to receive promotions. For example, the promotion generation module 510 may compare the percentile price that a customer typically pays for a particular product to that paid by other customers to restrict promotions to only those customers who need an additional inducement to trigger an acquisition. Further, if a single promotion is offered for an item, the promotion generation module 510 may provide the promotion to the customer interface system 340. If multiple promotions are offered for an item, the promotion generation module 510 may reconcile the potential promotions, such as by selecting one of the promotions, or portions of more than one promotion, and provide the result to the customer interface system 340.

The customer interface system 340 may include systems for identifying and communicating with one or more customers. For example, the customer interface system 340 may, separately or in any combination, include an input device and an output device. The output device may be any type of visual, manual, audio, electronic or electromagnetic device capable of communicating information from a processor or memory to a person or other processor or memory. Examples of output devices include, but are not limited to, monitors, speakers, liquid crystal displays, networks, buses, and interfaces. The input device may be any type of visual, manual, mechanical, audio, electronic, or electromagnetic device capable of communicating information from a person or memory to a processor or memory. Examples of input devices include keyboards, microphones, voice recognition systems, trackballs, mice, networks, buses, and interfaces. Alternatively, the input and output devices may be included in a single device such as a touch screen, computer, processor or memory coupled with the processor via a network. For example, the input and output devices may include an infrared transmitter and receiver for communicating with a customer's portable computer or personal digital assistant (“PDA”).

The input device, whether alone or combined with the output device, may allow a customer to communicate identifying information to the runtime system 300. For example, the customer may enter an identification, such as a password or customer number that uniquely identifies the customer, into the system via a keyboard or touch-screen. The customer interface system 340 may additionally or alternatively include a card reader or other such device that can obtain a customer's identifying information from a credit card, bank card, frequent shopper card, loyalty card, or any other such card containing information that uniquely identifies the customer. Alternatively, biometric devices can be used, including, but not limited to, fingerprint readers, voice recognition, face recognition, and signature recognition. In addition, the input device, whether alone or combined with the output device, may include a device or system for gathering customer transaction data. For example, the input device may include one or more barcode scanning systems to identify and track consumer acquisitions at checkout or while the customer is shopping.

Further, promotions offered to the customers or promotions accepted by customers (such as the customers purchasing the product or service) may be subject to billing. As discussed in more detail below, the promotion given to the customer may be part of a larger promotion plan. Billing for the promotion, either in terms of billing for an impression of the promotion or for acceptance of the promotion, may be performed by a billing module 350. If billing is based on an impression of the promotion, the billing module 350 may record billing information when the individualized application runtime module 330 sends a promotion to the customer interface system 340. The billing information may be used to calculate a fee for the service of providing the promotion. Alternatively, if billing is based on acceptance of the promotion, the billing information may be recorded after comparison, for a specific shopping trip, of the promotions offered to the customer and the transactions made by the customer to determine if the customer accepted the promotion. Moreover, the fee may be constant for every impression or acceptance of the promotion. Alternatively, the fee may depend on the customer model for the customer receiving the impression or accepting the promotion or may depend on the goals of the promotion. For example, the fee may depend on ratings of certain attributes in the customer's model, such as the brand loyalty attribute. A higher fee may be charged for an acceptance of a promotion for a customer with a higher brand loyalty attribute rating and a lower fee may be charged for an acceptance of a promotion for a customer with a lower brand loyalty attribute rating. As another example, a promotion may have defined goals, such as brand switching, brand extensions, etc. The fee may be based on the outcome, such as one fee for a brand switch or another fee for a brand extension. The fee may then be billed to the party whose product is the subject of the promotion.
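The two billing bases described above (per impression or per acceptance) could be sketched as below, using the constant-fee case. The function names, the representation of promotions as product names, and fees expressed in integer cents are assumptions for illustration; fees that vary with the customer model or promotion goal are not modeled here.

```python
def billable_events(offered, purchased, basis):
    """Return the promotions that generate a fee for one shopping trip.

    offered:   products for which a promotion was sent to the customer.
    purchased: products the customer bought on the same trip.
    basis:     "impression" or "acceptance".
    """
    if basis == "impression":
        return list(offered)
    if basis == "acceptance":
        # Accepted means the promoted product appears in the trip's transactions.
        return [p for p in offered if p in purchased]
    raise ValueError(f"unknown billing basis: {basis!r}")

def promotion_fee(offered, purchased, basis, fee_cents):
    """Constant fee per billable event (hypothetical flat-fee schedule)."""
    return len(billable_events(offered, purchased, basis)) * fee_cents

offered = ["yogurt", "salsa", "magazine"]
purchased = {"yogurt", "milk", "magazine"}
impression_fee = promotion_fee(offered, purchased, "impression", fee_cents=10)
acceptance_fee = promotion_fee(offered, purchased, "acceptance", fee_cents=10)
```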

An individualized customer interaction system, such as that shown in FIG. 3, may be adapted for a variety of circumstances. For example, such an individualized customer interaction system may be implemented in a retail store, such as a grocery store, to provide customers with a shopping list and individualized promotions.

An example of an individualized customer interaction system for a grocery store is shown in FIG. 7. The Grocery System 700 may be implemented in a location used to sell groceries, such as a grocery store, and provides individualized interaction with a customer for the duration of the customer's visit to such location. The term “grocery store” will be used in this document to refer to any location used to sell groceries, including, but not limited to, a market, store, mall or other such location, whether in or out of doors. The Grocery System 700 generally includes a shopping list prediction module 710, transactional data system 720, individualized application module 730, and customer interface module 740. These modules include all the components and features, in any combination, as described in connection with the individualized customer interaction system (see FIG. 3), except as otherwise indicated.

The customer interface module 740 enables communication with a customer throughout the duration of a transaction. The customer interface module 740 may include a check-in terminal 744 and a customer identification module 742 for identifying the customer, and one or more access points 748, 750, 752, a customer locator module 746 and a mobile customer interface module 754 that together keep track of a customer's location within the grocery store.

The check-in terminal 744 may be in communication with the customer identification module 742. Alternatively, the check-in terminal 744 and the customer identification module 742 may be included in a single device. The check-in terminal 744 may include an interface storage space for storing one or more mobile customer interface modules 754, and one or more input and/or output devices as previously described. A customer may provide identifying information to the Grocery System 700 using the input and/or output device. This identifying information may then be communicated to the customer identification module 742, which may compare the information with information stored in a database (such as the runtime transactional database 722) to identify the customer. If the customer identification module 742 successfully identifies the customer, the check-in terminal 744 may allow the customer to check out a mobile customer interface module 754.

The mobile customer interface module 754 may include a PDA, touch-screen or other such device that provides communication between the customer and the Grocery System 700. The mobile customer interface module 754 may be of a size that is convenient for the customer to carry or be easily attached to a shopping cart, dolly, or other such device. In addition, the mobile customer interface module 754 may include a wireless communication system for wirelessly communicating with the Grocery System 700 via one or more access points located throughout the grocery store. This wireless communication system may include an antenna and a modem or router for communicating via a wireless protocol such as IEEE 802.11.

One or more access points 748, 750, 752 may be located at various locations in the grocery store. The access points 748, 750, 752 may generally include a communication system that is compatible with and complementary to that of the mobile customer interface module 754. The communication range of each of the access points is limited so that there is limited overlap of the communication ranges of adjacent access points. However, the access points may be sufficient in number so that the combination of their communication ranges covers the entire grocery store. Each access point 748, 750, 752 may only communicate with a mobile customer interface module 754 when such a module 754 is located within the communication range of that particular access point 748, 750, 752, respectively.

In addition, each access point 748, 750, 752 may include an identifier, such as an alphanumeric sequence, which is unique to that particular access point 748, 750, 752, respectively. Therefore, when an access point, for example access point 748, communicates with a mobile customer interface module 754, the access point communicates its identifier to the customer locator module 746. The customer locator module 746 may use the identifier to determine that the mobile customer interface module 754 is located within the communication range of the access point 748, thus locating the customer. Various approaches can be used to locate the mobile device, from IR and Bluetooth beacons to determining the location of the last scanned item.

Because the Grocery System 700 is able to track the location of a customer, the Grocery System 700 is able to provide individualized interaction with the customer. As discussed above, the individualized interaction may be based in part on the customer model. Further, the timing to convey the individualized interaction with the customer may vary. For example, promotions for all of the items on the predicted shopping list may be provided at the same time. Or, because of the potential to inundate the customer with information, the promotions may be spaced apart from one another, depending on the customer's location in the grocery store. For example, if the customer is in the dairy aisle, promotions relating to any one or all of the dairy items may be provided to the customer.

The Grocery System 700 may further include a transactional data system 720. The transactional data system 720 may include a transactional database 722 and a transactional data collection module 724. The transactional data collection module 724 may include a device, such as a bar code scanner, for identifying the product categories a customer intends to purchase. The transactional data collection module 724 may be located at the grocery store's checkout counter to determine the product categories actually purchased by the customer at the end of a particular transaction. The customer may be identified by the transactional data collection module 724 in a variety of ways, such as by scanning a store loyalty card or using biometric identification techniques. Alternatively, the transactional data collection module 724 may be located next to or within the access points 748, 750, 752. This allows the Grocery System 700 to identify the product categories a customer is intending to buy during the transaction.

Many of the components of the Grocery System 700 may be implemented in a computer system. For example, the shopping list prediction runtime module 710, the individualized application runtime module 730, portions of the transactional data system 720 (such as the transactional database 722), and portions of the customer interface module 740 (such as the customer locator module 746 and the customer identification module 742) may be implemented in a computer system.

C. Example of Application to Grocery Store

In practice, predicting grocery shopping lists is interesting as a learning problem because of the sheer number of classes that must be predicted. Abstracting from the lowest product category level, the product level (which includes about 60,000 product categories), to the level of relatively specific product categories that may be useful for grocery lists reduces this number to a degree. However, for real world datasets, the number of classes may be from fifty to a hundred classes per customer, with tens of thousands of regular customers per grocery store.

In general, the metrics used to evaluate the performance of the shopping list predictors per class are the standard recall, precision, accuracy and F-measure quantities. For a set of test examples, recall is defined as the number of true positive predictions divided by the number of positive examples. Precision is defined as the number of true positive predictions over the total number of positive predictions. Accuracy is defined as the number of correct predictions divided by the total number of examples. F-measure is defined as the harmonic mean of recall and precision as defined by the following equation:

$\begin{matrix}\frac{2*{recall}*{precision}}{{recall} + {precision}} & (2)\end{matrix}$
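As an illustration only, the four per-class metrics defined above may be computed from binary labels as in the following Python sketch. The function name `class_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for the example and are not part of the described system.

```python
def class_metrics(y_true, y_pred):
    """Return (recall, precision, accuracy, f_measure) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    total = len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = recall + precision
    # Harmonic mean of recall and precision, as in equation (2).
    f_measure = 2 * recall * precision / denom if denom else 0.0
    return recall, precision, accuracy, f_measure
```

Each class (product category) of a customer would be scored separately with such a function before any aggregation is applied.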

There are many considerations to take into account to obtain an overall measure of performance by which success may be measured when predicting shopping lists for large groups of customers. Typically, in a learning scenario with a large number of classes, the metrics, such as those previously described, may be aggregated in several ways. Microaveraged results may be obtained by aggregating the test examples from all classes together and evaluating each metric over the entire set. An alternative includes macroaveraging the results. Macroaveraging the results includes evaluating each metric over each class separately, and then averaging the results over all classes. The first alternative tends to produce higher results than the second alternative. This occurs because when the number of classes is large and very unbalanced, the microaveraged results are implicitly dominated by classes with a large number of examples, while the macroaveraged results are dominated by classes with a smaller number of examples. Macroaveraging provides a measure of how the shopping list prediction runtime module 320 performs for the majority of customers rather than just those with a large number of transactions.
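The difference between pooling examples before scoring and scoring each class before averaging can be sketched in Python as follows. The helper names, the class names, and the toy data are hypothetical and chosen only to show how one large class dominates the pooled (micro) result.

```python
def recall(pairs):
    """Recall over (label, prediction) pairs; 0.0 if there are no positives."""
    positives = sum(1 for t, _ in pairs if t)
    hits = sum(1 for t, p in pairs if t and p)
    return hits / positives if positives else 0.0

def micro_average(per_class, metric):
    """Pool the examples of all classes, then evaluate the metric once."""
    pooled = [pair for pairs in per_class.values() for pair in pairs]
    return metric(pooled)

def macro_average(per_class, metric):
    """Evaluate the metric on each class, then average over the classes."""
    scores = [metric(pairs) for pairs in per_class.values()]
    return sum(scores) / len(scores)

# Hypothetical data: one large, well-predicted class and one small, missed class.
per_class = {
    "milk": [(1, 1)] * 9 + [(1, 0)],   # 10 positives, 9 predicted
    "saffron": [(1, 0), (1, 0)],       # 2 positives, none predicted
}
micro = micro_average(per_class, recall)
macro = macro_average(per_class, recall)
```

In this toy case the pooled recall is 9 of 12 positives, while the per-class average is pulled down by the small class.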

However, the transactional nature of the transaction data source makes it possible to aggregate in additional ways. One option would be to aggregate all examples associated with a single customer, obtain results for the metrics discussed previously for each set, and average them (“Customer Averaging”). Customer Averaging shows how the shopping list prediction runtime module 320 performs for the average customer. Although these aggregate sets are still unbalanced, given that some customers shop more than others, the average results for Customer Averaging are generally between those of the micro- and macro-averaging approaches. Another option is to aggregate on the transaction level (“Transaction Averaging”). Using Transaction Averaging, all the examples from each transaction are aggregated, each metric is calculated, and the results are averaged over all transactions. Transaction Averaging may determine, per trip, how many of the categories that were predicted (in other words, included on the shopping list) were acquired, and how many of the acquired categories were predicted. However, because Transaction Averaging breaks up example sets within classes, it may be difficult to compare the results of Transaction Averaging with those of the other aggregation techniques.
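Customer Averaging and Transaction Averaging differ only in the key by which examples are grouped before the metric is evaluated. The following Python sketch shows one way this might be expressed; the row layout `(customer, transaction, label, prediction)` and the function names are assumptions made for the example.

```python
from collections import defaultdict

def precision(pairs):
    """Precision over (label, prediction) pairs; 0.0 if nothing was predicted."""
    predicted = sum(1 for _, p in pairs if p)
    hits = sum(1 for t, p in pairs if t and p)
    return hits / predicted if predicted else 0.0

def grouped_average(rows, key, metric):
    """Group rows by `key`, evaluate `metric` per group, average the scores."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append((row[2], row[3]))
    scores = [metric(pairs) for pairs in groups.values()]
    return sum(scores) / len(scores)

# Hypothetical rows: (customer, transaction, label, prediction).
rows = [
    ("c1", "t1", 1, 1), ("c1", "t1", 0, 1),
    ("c1", "t2", 1, 1),
    ("c2", "t3", 0, 1), ("c2", "t3", 0, 0),
]
customer_avg = grouped_average(rows, lambda r: r[0], precision)
transaction_avg = grouped_average(rows, lambda r: (r[0], r[1]), precision)
```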

The shopping list prediction runtime module 320 was tested using data for several thousand customers. The dataset contained transaction data describing the purchases made by over 150,000 customers in a grocery store over two years. From this overall set, 22,000 of the customers shopped between 20 and 300 times, which was a legitimate population for whom to predict shopping lists. This population was sampled to produce a dataset of 2200 customers with 146,000 associated transactions. Because the number of transactions for each customer followed a power law, uniform random sampling to select 10% of the customers would have resulted in a sample skewed towards customers with a small number of transactions. To obtain a representative sample, the population was split into deciles along three attributes: total amount spent, total number of transactions, and

$\frac{\# {transactions}}{amountspent}.$

For each set of deciles, 10% of the data was selected with uniform probability from each decile. The 10% samples obtained for each attribute were found to be statistically similar to the other two. Therefore, the final sample used was the one taken along total amount spent.
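A decile-stratified draw of this kind can be sketched in Python as follows. The function name `decile_sample`, the equal-count binning, the rounding of each decile's quota, and the synthetic customer records are assumptions made for the example; the original sampling code is not described in further detail.

```python
import random

def decile_sample(customers, attribute, fraction=0.10, seed=0):
    """Sort by `attribute`, cut into ten equal-count bins, and draw
    `fraction` of each bin uniformly at random without replacement."""
    rng = random.Random(seed)
    ordered = sorted(customers, key=attribute)
    n = len(ordered)
    sample = []
    for d in range(10):
        decile = ordered[d * n // 10:(d + 1) * n // 10]
        quota = round(len(decile) * fraction)
        sample.extend(rng.sample(decile, quota))
    return sample

# Synthetic population; "total_spent" is a hypothetical field name.
customers = [{"id": i, "total_spent": float(i)} for i in range(1000)]
sample = decile_sample(customers, lambda c: c["total_spent"])
```

Because every decile contributes the same share, customers with few transactions or low spend cannot crowd out the rest of the population in the sample.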

The transactional information included, in addition to the attributes described in the previous section, lists of product categories purchased during each transaction. Products were arranged in a hierarchy of product categories of increasing generality. At a fairly specific level of this hierarchy, the product categories resembled grocery shopping list items. Examples of these product categories included: cheddar cheese, dog food, sugar, laundry detergents, red wine, heavy cream, fat-free milk, tomatoes, and other grocery items. In total, 551 product categories were represented in the dataset, forming the set P as defined previously. Customers within the sample bought 156 distinct product categories on average (with a standard deviation of 59). Of these product categories, the set P_(c) for each customer was restricted to include only the product categories bought during 10% or more of the customer's transactions. As a result, the average size of P_(c) for a given c was 48 (with a standard deviation of 27.59).
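The restriction of P_(c) to categories bought in at least 10% of a customer's transactions may be sketched in Python as follows. The function name and the representation of each transaction as a list of category names are assumptions made for the example.

```python
from collections import Counter

def frequent_categories(transactions, min_share=0.10):
    """Return the categories present in at least `min_share` of the baskets."""
    # Count each category at most once per basket.
    counts = Counter(cat for basket in transactions for cat in set(basket))
    cutoff = min_share * len(transactions)
    return {cat for cat, n in counts.items() if n >= cutoff}

# Hypothetical customer with 20 transactions: sugar was bought only once.
baskets = [["milk", "sugar"]] + [["milk"]] * 19
p_c = frequent_categories(baskets)
```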

For each transaction for each of the customers in the sample, examples were constructed as described above. The datasets for each category ranged from 4 to about 240 examples. For each class in the resulting dataset, the example sets were split into a training set, which included the first 80% of examples in temporal order, and a test set, which included the last 20%.
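A temporal split of this kind keeps the examples in their original order rather than shuffling them. A minimal Python sketch, assuming the examples are already sorted by transaction time and that the cut point is rounded down, is:

```python
def temporal_split(examples, train_share=0.8):
    """Split time-ordered examples into (train, test) without shuffling."""
    cut = int(len(examples) * train_share)
    return examples[:cut], examples[cut:]

train, test = temporal_split(list(range(10)))
```

Splitting in temporal order means the predictor is always evaluated on transactions that occur after those it was trained on.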

The shopping list prediction runtime module 320 was tested using a variety of methodologies, including rule-based, machine learning and hybrid approaches. The rule-based methods were run on the test sets to provide consistency in evaluation. For the top n methods, a cutoff of 10 categories was chosen. For the decision tree classifier, C4.5 was used with 25% pruning and default parameterization. For the linear methods, the SNoW learning system was used (see A. Carlson, C. Cumby, J. Rosen, and D. Roth, “The SNoW learning architecture,” Technical Report UIUCDCS-R-99-2101, UIUC Computer Science Department, May 1999). SNoW is a general classification system incorporating several linear classifiers in a unified framework. The classifiers were trained with two (2) runs over each training set.

Results of the test performed on the shopping list prediction runtime module 320 using the various approaches are shown in Tables 1 and 2 below, broken down in terms of the transaction and customer averaging methods.

TABLE 1
Method        Recall   Prec   F-Measure   Accuracy
Random        .19      .20    .19         .65
Sameas        .26      .26    .26         .70
Top-10        .37      .35    .36         .59
Perceptron    .38      .26    .31         .65
Winnow        .17      .36    .23         .79
C4.5          .22      .34    .24         .77
Hybrid-Per    .59      .28    .38         .53
Hybrid-Win    .43      .36    .39         .65
Hybrid-C4.5   .46      .35    .40         .62

TABLE 2
Method        Recall   Prec   F-Measure   Accuracy
Random        .21      .19    .20         .65
Sameas        .25      .29    .27         .70
Top-10        .41      .33    .37         .65
Perceptron    .40      .27    .32         .66
Winnow        .17      .38    .24         .79
C4.5          .25      .28    .26         .70
Hybrid-Per    .60      .27    .37         .55
Hybrid-Win    .44      .32    .37         .64
Hybrid-C4.5   .48      .34    .40         .62

FIGS. 8 and 9 include graphs that show the performance of the shopping list prediction runtime module 320 using the top-n approach for different values of n.

For the shopping list prediction runtime module 320 using the linear classification methods, the activation values output by the shopping list prediction runtime module 320 were normalized to produce a confidence score for each class. Then, a threshold different from the threshold used in training was chosen to test the performance of the shopping list prediction runtime module 320. FIGS. 10 and 11 show the performance of the shopping list prediction module 110 using Winnow and Perceptron classifiers at different confidence thresholds. The activations were normalized to confidence values between −1 and +1, with the original training threshold mapped to 0.

As previously discussed, one application for predicting shopping lists is to reclaim forgotten purchases. However, the dataset used in testing did not include information on the instances in which categories were forgotten. Further, assumptions about the instances in which forgetting had occurred were not made. However, these methodologies should be somewhat robust to label noise as long as they are not overfitting the data. In order to estimate this robustness and determine the value of the predicted forgotten purchases, some assumptions were made about the distribution of the instances of forgotten purchases, and noisy label values were corrected in the test data. Training was performed on the noisy data, and then an evaluation was performed on the corrected test data, to demonstrate an increase in the number of true positive predictions without a serious increase in false negatives.

The manner of estimating which noisy labels in the test data to correct is described as follows. First, for each class p ∈ P_(c) for a given customer, the mean μ and standard deviation σ of the replenishment interval i were determined. Next, examples for which i ≥ μ+c*σ for different constants c were identified. For each of these examples that had a negative label, a determination was made as to whether any example within a window of k following transactions was positive. Each such example was estimated to be an instance of forgetting, with a noisy negative label.
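The estimation procedure above may be sketched for a single class in Python as follows. The function name, the use of the population standard deviation, and the default values for c and k are assumptions made for the example; the description does not fix them.

```python
from statistics import mean, pstdev

def estimate_forgotten(intervals, labels, c=1.0, k=2):
    """Return indices of negative examples flagged as likely forgetting.

    An example is flagged when its replenishment interval is at least
    mu + c * sigma and some example among the next k transactions is positive.
    """
    mu, sigma = mean(intervals), pstdev(intervals)
    flagged = []
    for idx, (interval, label) in enumerate(zip(intervals, labels)):
        overdue = interval >= mu + c * sigma
        bought_soon = any(labels[idx + 1:idx + 1 + k])
        if overdue and not label and bought_soon:
            flagged.append(idx)
    return flagged

# Hypothetical class: usually bought weekly, skipped once after 21 days.
intervals = [7, 7, 7, 21, 7, 7]
labels = [1, 1, 1, 0, 1, 1]
flagged = estimate_forgotten(intervals, labels)
corrected = [1 if i in flagged else y for i, y in enumerate(labels)]
```

The `corrected` list shows how flagged negative labels would be changed to positive before re-evaluating a classifier on the test data.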

To evaluate the robustness of the predictors of the shopping list prediction runtime module 320 to this noise, each noisy negative label was changed to be positive, and each classification method was re-evaluated on the modified test data. The transaction averaged results of this evaluation are summarized in Table 3 below for c=1.

TABLE 3
Method        Recall   Prec   F-Measure   Accuracy
Random        .20      .21    .20         .64
Sameas        .23      .28    .26         .69
Top-10        .37      .36    .37         .65
Perceptron    .42      .31    .36         .61
Winnow        .16      .40    .23         .75
C4.5          .22      .39    .28         .73
Hybrid-Per    .60      .32    .42         .54
Hybrid-Win    .43      .41    .42         .65
Hybrid-C4.5   .46      .38    .42         .62

II. Promotion Planning

As discussed above, one aspect of a retail establishment's business is promotion planning. The retail establishment may wish to improve its promotion planning in one of several ways including selection of the parameters of the promotion and simulating the promotion. Promotion planning may include reviewing and reasoning about the goals, parameters, and results of a promotion for a single product, while simulating these promotions for each customer using their personal profile. This may be accomplished via the promotion planning method and system described below.

The method and system allow a user to modify the purposes of each promotion using a set of high-level goals, which are mapped to the practical parameters of a sale to produce general rules of the type mentioned above. By simulating the effects of a promotion on each customer targeted with respect to specific retailer/manufacturer goals, the method and system allow a completely new type of pricing model for trade promotions, bringing the pay-for-performance philosophy to a domain that has traditionally been administered on a very crude basis. The method and system also offer the advantage of building and simulating sets of rules collectively and evaluating their interactions, rather than manually in isolation. Described below are the operations of the method and system, from goal selection to optimization, simulation, and pricing.

With regard to goals, promotion planning is often difficult in terms of trying to select the parameters of the promotion which may meet the goals of the promotion. Examples of general goals include, but are not limited to, brand, revenue, lift, and market share. Brand goals are generally related to a specific brand, and can include: (1) brand switches (e.g., the number of switches to the brand which is subject to the promotion); (2) brand extensions (e.g., number of purchases of a product which is related to a brand); (3) new trials of a product; and (4) loyalty rate for existing customers. Revenue goals may be classified as: (1) short term revenues (increase in percentage or amount of sales over a predetermined period, such as the following 2 weeks); (2) long term revenues (increase in percentage or amount of sales over a predetermined period, such as the following 3 months); (3) trend; and (4) brand revenue (revenue for a brand as a whole, such as the entire Ivory® line of products, or revenue for a specific brand product, such as Ivory® soap). Lift goals may relate to an increase in volume of current sales without regard to revenue. Market share relates to an increase in percent of the market share relative to another party in the market.

The parameters of the promotion may include any one or all of the following: (1) the duration of the promotion (such as the number of days, weeks, months, etc.); (2) the discount applied (such as a percentage reduction, a fixed amount reduced from the price, a sale price, etc.); and (3) any values for the attributes of the customer model (such as ranges of values for behavior, brand loyalty, wallet share, price sensitivity, promotion sensitivity, product category substitution, basket variability, frequency of shopping). For example, attributes may include any one or all of the following: (1) minloy, maxloy, which may be the minimum and maximum loyalty scores for the consumers in the target group; (2) minhoard, maxhoard, which may be the minimum and maximum hoarding scores for the consumers in the target group; (3) minsensitivity, maxsensitivity, which may be the minimum and maximum price sensitivity scores for the consumers; and (4) mintrial, maxtrial, which may be the minimum and maximum new trial rate scores for the consumers.

Referring to FIG. 12, there is shown a block diagram of a promotion planning system 1200. The promotion planning system 1200 includes an initial parameter selection module 1210, which may select initial parameters for the promotion. Instead of using an ad-hoc process of selecting parameters for a promotion, one aspect of the invention is to derive the parameters from optimization of the goals of the promotion. As discussed above, there may be a variety of goals of a promotion. The goals of the promotion may be entered via an input/output device 1250, and transmitted to the initial parameter selection module 1210.

The following is a representation of the promotion as a function of the goals of the promotion and the parameters of the promotion:

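$\begin{matrix}{f(x_{1} \ldots x_{n}) = c_{1} \cdot \mathrm{brand}(x_{1} \ldots x_{n}) + c_{2} \cdot \mathrm{lift}(x_{1} \ldots x_{n}) + c_{3} \cdot \mathrm{market}(x_{1} \ldots x_{n}) + c_{4} \cdot \mathrm{rev}(x_{1} \ldots x_{n})} & (3)\end{matrix}$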

where x₁ . . . x_(n) are the parameters for the promotion, and c₁, c₂, c₃, and c₄ are the coefficients for the brand, lift, market and revenue goal functions. The coefficients may comprise weights of the goals relative to one another. For example, the function for the brand goal may be represented as a function of the sub-goals and the parameters of the promotion:

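$\begin{matrix}{\mathrm{brand}(x_{1} \ldots x_{n}) = b_{1} \cdot \mathrm{switch}(x_{1} \ldots x_{n}) + b_{2} \cdot \mathrm{ext}(x_{1} \ldots x_{n}) + b_{3} \cdot \mathrm{trials}(x_{1} \ldots x_{n}) + b_{4} \cdot \mathrm{loyalty}(x_{1} \ldots x_{n})} & (4)\end{matrix}$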

where the coefficients b₁, b₂, b₃, and b₄ represent the values for the sub-goals of brand switches, brand extensions, new trials, and loyalty. Further, the coefficients may represent the weights of the sub-goals relative to one another. For example, the function for the switch sub-goal may be represented as:

$\begin{matrix}{{{switches}\mspace{14mu} \left( {x_{l}\mspace{14mu} \ldots \mspace{14mu} x_{n}} \right)} = \frac{\begin{matrix}{\left( {\frac{duration}{avg\_ repl} - {{conv\_ rate} \cdot \frac{\left( {{\min \mspace{14mu} {loy}} + {\max \mspace{14mu} {loy}}} \right)}{2}}} \right) \cdot} \\{{custs}\left( {{\min \mspace{14mu} {loy}},{\max \mspace{14mu} {loy}}} \right)}\end{matrix}}{{duration}^{\; 2}}} & (5)\end{matrix}$

where avg_repl is the average replenishment rate of customers who buy the target category, and conv_rate is the number of promotion instances needed to switch a customer with 100% loyalty to a competing brand (estimated from training transactions). The custs quantity is estimated by assuming the number of customers is distributed normally with respect to brand loyalty, and using the normal cumulative distribution function:

$\begin{matrix}{{p(x)} = {\frac{1}{\sigma \sqrt{2\pi}}{\int_{\min \; {{loy}/\sigma}}^{\max \; {{loy}/\sigma}}{\frac{- \left( {t - \mu} \right)^{2}}{2\; \sigma^{2}}{t}}}}} & (6)\end{matrix}$

where μ is the mean loyalty of the customers and σ is the standard deviation. The set of constraints C may contain equality/inequality constraints over any of the input variables x₁ . . . x_(n) to express rules such as “promotions on product p may never exceed 50 days in duration.”
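As a rough numerical illustration of equations (5) and (6), the switch sub-goal may be evaluated in Python as follows. The sketch uses the conventional form of the normal cumulative distribution function (probability mass between minloy and maxloy for a normal fit with mean μ and standard deviation σ), scaled by a customer count; this scaling, the parameter `n_customers`, and the function names are assumptions that may differ in detail from equation (6) as written.

```python
from statistics import NormalDist

def custs(minloy, maxloy, mu, sigma, n_customers):
    """Estimated number of customers with loyalty in [minloy, maxloy],
    assuming loyalty is normally distributed (assumed scaling)."""
    dist = NormalDist(mu, sigma)
    return n_customers * (dist.cdf(maxloy) - dist.cdf(minloy))

def switches(duration, avg_repl, conv_rate, minloy, maxloy,
             mu, sigma, n_customers):
    """Evaluate the switch sub-goal following the structure of equation (5)."""
    visits = duration / avg_repl                 # expected replenishment visits
    needed = conv_rate * (minloy + maxloy) / 2   # visits needed at mid loyalty
    reach = custs(minloy, maxloy, mu, sigma, n_customers)
    return (visits - needed) * reach / duration ** 2

# Hypothetical inputs: a 28-day promotion and a 7-day replenishment interval.
value = switches(28, 7, 4, 0.0, 1.0, mu=0.5, sigma=0.2, n_customers=1000)
```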

Functions for the other sub-goals of the brand goal (ext, trials, and loyalty) and sub-goals for the lift, market, and revenue goals may similarly be obtained.

B. Optimization

The initial parameter selection module 1210 may optimize the objective function for the promotion f(x₁ . . . x_(n)) for any one, some, or all of the promotion parameters x₁ . . . x_(n). The optimization may be performed using non-linear optimization. Further, the optimization may be a local optimization or a global optimization. In addition to the goals, some of the parameters of the promotion may be given ranges. For example, the duration of the promotion may be given a range of 1 week to 10 weeks, and the optimization may optimize the duration parameter to within the prescribed range.

The initial parameter selection module 1210 may thus output to the customer selection module 1220 suggested parameters for the promotion. The suggested parameters may include various attributes, such as brand loyalty. The customer selection module 1220 may access the available consumer profiles, discussed above, for the retail establishment and select a subset of available consumer profiles based on the suggested parameters. For example, if a customer has attributes within the ranges prescribed for (1) minloy, maxloy; (2) minhoard, maxhoard; (3) minsensitivity, maxsensitivity; and (4) mintrial, maxtrial, the customer may be part of the subset.
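The range test described above may be sketched in Python as follows. The profile field names and the numeric values are hypothetical; only the idea of keeping customers whose attributes fall within every prescribed (min, max) pair is taken from the description.

```python
def select_customers(profiles, ranges):
    """Keep the profiles whose every listed attribute lies inside its range."""
    return [
        p for p in profiles
        if all(lo <= p[attr] <= hi for attr, (lo, hi) in ranges.items())
    ]

# Hypothetical consumer profiles.
profiles = [
    {"id": 1, "loyalty": 0.3, "hoarding": 0.6, "sensitivity": 0.8, "trial": 0.2},
    {"id": 2, "loyalty": 0.9, "hoarding": 0.1, "sensitivity": 0.4, "trial": 0.1},
]
# Suggested parameters expressed as (min, max) pairs.
ranges = {
    "loyalty": (0.0, 0.5),       # (minloy, maxloy)
    "hoarding": (0.5, 1.0),      # (minhoard, maxhoard)
    "sensitivity": (0.5, 1.0),   # (minsensitivity, maxsensitivity)
    "trial": (0.0, 0.5),         # (mintrial, maxtrial)
}
subset = select_customers(profiles, ranges)
```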

More specifically, let x₁ . . . x_(n) be the parameter variables described above. A hierarchical multi-objective optimization problem may be defined of the following form:

$\begin{matrix}{\underset{x_{1} \ldots x_{n}}{\operatorname{argmax}}\; f(x_{1} \ldots x_{n}) = \underset{x_{1} \ldots x_{n}}{\operatorname{argmax}} \begin{bmatrix} f_{brand}(x_{1} \ldots x_{n}) \\ f_{revenue}(x_{1} \ldots x_{n}) \\ f_{lift}(x_{1} \ldots x_{n}) \\ f_{mshare}(x_{1} \ldots x_{n}) \end{bmatrix} \quad \mathrm{wrt}\; C = \{ c_{1} \ldots c_{k} \}} & (7)\end{matrix}$

where the set C of constraints on x₁ . . . x_(n) is given by the user. f(x₁ . . . x_(n)) is reformulated as a weighted sum:

$\begin{matrix}{f(x_{1} \ldots x_{n}) = g_{1} \cdot f_{brand}(x_{1} \ldots x_{n}) + g_{2} \cdot f_{revenue}(x_{1} \ldots x_{n}) + g_{3} \cdot f_{lift}(x_{1} \ldots x_{n}) + g_{4} \cdot f_{mshare}(x_{1} \ldots x_{n})} & (8)\end{matrix}$

Each term in this sum may itself be expressed as a weighted sum, yielding a single objective function. The subobjectives may be as follows:

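$\begin{matrix}{f_{brand}(x_{1} \ldots x_{n}) = b_{1} \cdot \mathrm{switches}(x_{1} \ldots x_{n}) + b_{2} \cdot \mathrm{extensions}(x_{1} \ldots x_{n}) + b_{3} \cdot \mathrm{newtrials}(x_{1} \ldots x_{n}) + b_{4} \cdot \mathrm{loylevel}(x_{1} \ldots x_{n})} & (9)\end{matrix}$

$\begin{matrix}{f_{revenue}(x_{1} \ldots x_{n}) = r_{1} \cdot \mathrm{shortrev}(x_{1} \ldots x_{n}) + r_{2} \cdot \mathrm{longrev}(x_{1} \ldots x_{n}) + r_{3} \cdot \mathrm{brandrev}(x_{1} \ldots x_{n})} & (10)\end{matrix}$

$\begin{matrix}{f_{lift}(x_{1} \ldots x_{n}) = l_{1} \cdot \mathrm{shortlift}(x_{1} \ldots x_{n}) + l_{2} \cdot \mathrm{longlift}(x_{1} \ldots x_{n}) + l_{3} \cdot \mathrm{brandlift}(x_{1} \ldots x_{n})} & (11)\end{matrix}$

$\begin{matrix}{f_{mshare}(x_{1} \ldots x_{n}) = m_{1} \cdot \mathrm{prodshare}(x_{1} \ldots x_{n})} & (12)\end{matrix}$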

To solve the non-linear optimization task, a sequential quadratic programming procedure may be employed.
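The following Python sketch illustrates the weighted-sum formulation on a single bounded parameter. For brevity it substitutes an exhaustive grid search for the sequential quadratic programming procedure named above, and the two toy goal functions, their weights, and all names are hypothetical stand-ins rather than the sub-goal functions of equations (9) through (12).

```python
def weighted_objective(goal_fns, weights):
    """Combine goal functions into one objective as a weighted sum."""
    def f(x):
        return sum(w * g(x) for g, w in zip(goal_fns, weights))
    return f

def grid_argmax(f, lo, hi, steps=90):
    """Maximize f over [lo, hi] by exhaustive search on an even grid.
    This is a simple stand-in for a non-linear optimizer, not SQP."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=f)

def toy_lift(weeks):
    """Hypothetical lift goal: longer promotions move more units."""
    return weeks

def toy_revenue(weeks):
    """Hypothetical revenue goal: revenue peaks at four weeks."""
    return -(weeks - 4) ** 2

objective = weighted_objective([toy_lift, toy_revenue], [1.0, 0.5])
# Range constraint on the duration parameter: 1 week to 10 weeks.
best_weeks = grid_argmax(objective, 1.0, 10.0)
```

With these toy goals the combined objective is maximized between the individual optima, which is the kind of trade-off the goal weights are meant to express.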

C. Simulation

The subset of consumer profiles may then be output to the promotion simulator 1230. The promotion simulator 1230 may simulate the promotion using the subset of the consumer profiles. Specifically, because the consumer profiles are individualized and personalized, the profiles better represent the consumers. The promotion may be “offered” to the subset of customers via the subset of customer profiles, thereby simulating the results. Therefore, the results of the simulation using the consumer profiles may be more accurate.

The system may show a user simulations of promotional results directly related to the goals of the promotion, by creating promotional rules based on the parameters described above and applying these rules iteratively to each customer. Heuristic measures may then be applied to gauge the results related to each goal defined above. These heuristics, while derived from the customer transactional data, may not be systematically evaluated in terms of their empirical accuracy until a true user test can be arranged. Many other sets of heuristics or learned models could be created to explain the results. For each heuristic h_(i) described below, h_(i) is summed over all customers to produce the final simulated result.

1. Brand Heuristics

$\begin{matrix}{\mspace{79mu} {h_{switch} = \left\{ \begin{matrix}{result\_ prob} & {{{if}\mspace{14mu} {numvisits}} \geq {{loy}_{other} \cdot {conv\_ rate}}} \\0 & {else}\end{matrix} \right.}} & (13) \\{h_{extensions} = \left\{ \begin{matrix}{result\_ prob} & {{{if}\mspace{14mu} {numvisits}} \geq {\left( {1 - {loy}_{this}} \right) \cdot {conv\_ rate}}} \\0 & {else}\end{matrix} \right.} & (14) \\{h_{newtrials} = \left\{ \begin{matrix}{result\_ prob} & {{{if}\mspace{14mu} {numvisits}} \geq {\left( {{loy} + {newtrial}} \right) \cdot {conv\_ rate}}} \\0 & {else}\end{matrix} \right.} & (15) \\{\mspace{79mu} {h_{loyalty\_ level} = {{{sens}({discount})}*{loy\_ change}*{num\_ visits}}}} & (16)\end{matrix}$

where result_prob=sens(discount)·(base−discount). sens(discount) may be a price sensitivity function, which may be a distribution that may be calculated for each customer and each product over all of the different price points, giving the probability of purchasing the product at a certain discount. In the above heuristics, the average replenishment rate per customer repl_rate is used to calculate the quantity numvisits as

$\frac{duration}{repl\_ rate}.$

The quantity conv_rate is the number of visits by the customer needed to obtain a conversion for the associated result. For example, the conv_rate for switching may be the number of visits to switch a customer with 100% loyalty (loy_(other)) to another brand. Therefore, when the brand loyalty is lower, fewer visits are necessary for a switch. As another example, the conv_rate for extensions is the number of visits necessary to obtain a brand extension for a customer who is 100% loyal to the brand (loy_(this)) subject to the extension. As still another example, the conv_rate for new trials is the number of visits necessary to convert a customer with 100% loyalty to any brand (loy). For h_(loyalty_level), loy_change is the average difference in loyalty seen for this customer after utilizing a promotion for the given product in the past, scaled by the probability of their taking the promotion.
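The switch heuristic of equation (13), and the summation of a heuristic over all targeted customers, may be sketched in Python as follows. The dictionary-based customer profiles, the constant price sensitivity functions, and the numeric values are hypothetical.

```python
def h_switch(customer, duration, base, discount, conv_rate):
    """Switch heuristic of equation (13) for one customer profile."""
    numvisits = duration / customer["repl_rate"]
    result_prob = customer["sens"](discount) * (base - discount)
    # The result counts only if the customer visits often enough to convert.
    if numvisits >= customer["loy_other"] * conv_rate:
        return result_prob
    return 0.0

def simulate(customers, heuristic, **params):
    """Sum a heuristic over all targeted customers."""
    return sum(heuristic(c, **params) for c in customers)

# Hypothetical profiles: replenishment interval in days, loyalty to another
# brand, and a constant price sensitivity function.
customers = [
    {"repl_rate": 7.0, "loy_other": 0.5, "sens": lambda d: 0.4},
    {"repl_rate": 14.0, "loy_other": 1.0, "sens": lambda d: 0.9},
]
total = simulate(customers, h_switch,
                 duration=28, base=5.0, discount=1.0, conv_rate=6)
```

In this toy case only the first customer makes enough visits during the promotion to count toward the simulated result.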

2. Revenue Heuristics

The revenue heuristics may encode the relative increase or decrease in revenues in the short term (promotion duration) and long term (e.g., 4 replenishment rates after promotion).

$\begin{matrix}{h_{short\_ term} = {\left( {{base} - {discount}} \right) \cdot {{sens}\left( {{base} - {discount}} \right)} \cdot {num\_ visits}}} & (17) \\{\mspace{79mu} {h_{long\_ term} = \left\{ \begin{matrix}{hoarding\_ score} & {{{if}\mspace{14mu} {{sens}\left( {{base} - {discount}} \right)}} > {.5}} \\0 & {else}\end{matrix} \right.}} & (18)\end{matrix}$

where base is the base price for the product, discount is the discount offered, the discount price is base−discount, sens(base−discount) is the price sensitivity to the discount price, and hoarding_score is the difference in revenue over the next several replenishment cycles (i.e., difference in the baseline revenue for this customer). The brand revenue heuristic may be evaluated by summing either the short or long term revenue heuristics over all products in the brand. Therefore, the revenue heuristics are an indicator of the incremental revenue (i.e., difference in revenue between what is expected with and without promotion), both in the short term and long term.
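Equations (17) and (18) may be evaluated for a single customer and product as in the following Python sketch. The step-shaped price sensitivity function and the numeric inputs are hypothetical and serve only to exercise the formulas.

```python
def h_short_term_revenue(base, discount, sens, num_visits):
    """Short term revenue heuristic of equation (17)."""
    price = base - discount
    return price * sens(price) * num_visits

def h_long_term_revenue(base, discount, sens, hoarding_score):
    """Long term revenue heuristic of equation (18)."""
    return hoarding_score if sens(base - discount) > 0.5 else 0.0

def sens(price):
    """Hypothetical purchase probability: 0.9 at or below 4.00, else 0.3."""
    return 0.9 if price <= 4.0 else 0.3

short_rev = h_short_term_revenue(5.0, 1.0, sens, num_visits=4)
long_rev = h_long_term_revenue(5.0, 1.0, sens, hoarding_score=-2.5)
```

A negative hoarding_score, as in this toy case, would represent revenue lost in later cycles because the customer stocked up during the promotion.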

3. Lift Heuristics

$\begin{matrix}{\mspace{79mu} {h_{short\_ term} = {\left( {{{sens}\left( {{base} - {discount}} \right)} - {{sens}({base})}} \right) \cdot {num\_ visits}}}} & (19) \\{h_{long\_ term} = {\left( {{{sens}\left( {{base} - {discount}} \right)} - {{sens}({base})}} \right) \cdot \frac{hoarding\_ score}{base}}} & (20)\end{matrix}$

where sens(base−discount) is the price sensitivity of the customer to the discount price, and sens(base) is the price sensitivity of the customer to the base price. The brand lift heuristic may be evaluated by summing either the short or long term lift heuristics over all products in the brand. Therefore, the lift heuristics are an indicator of the incremental lift (i.e., difference in volume between what is expected with and without promotion), both in the short term and long term.
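The lift heuristics of equations (19) and (20) share the same change in purchase probability, as the following Python sketch shows. The function name, the step-shaped price sensitivity function, and the numeric inputs are hypothetical.

```python
def lift_heuristics(base, discount, sens, num_visits, hoarding_score):
    """Return (short term lift, long term lift) per equations (19) and (20)."""
    delta = sens(base - discount) - sens(base)   # change in purchase probability
    short_lift = delta * num_visits
    long_lift = delta * hoarding_score / base
    return short_lift, long_lift

def step_sens(price):
    """Hypothetical purchase probability: 0.9 at or below 4.00, else 0.3."""
    return 0.9 if price <= 4.0 else 0.3

short_lift, long_lift = lift_heuristics(
    5.0, 1.0, step_sens, num_visits=4, hoarding_score=-2.5)
```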

4. Market Share Heuristics

$\begin{matrix}{h_{market\_ share} = {h_{switches} + \frac{h_{extensions}}{avg\_ ext}}} & (21)\end{matrix}$

where avg_ext is the average number of extensions over all the brands in the category. h_(market_share) is then the estimated increase/decrease in market share in terms of loyal customers over the next promotion period.

FIG. 13 shows an example of an output from the promotion simulator 1230. The output is for a specific brand of whisky. As discussed above, the promotion simulator may be for any product category. The projection details show one representation of the results of the simulation. The projection details may include, but are not limited to: (1) the duration of the promotion; (2) the number of customers targeted by the promotion (i.e., the subset of the customers); (3) the number of expected visits (based on the frequency of visits attribute for the customer, the last visit of the customer, and the duration, the number of visits may be calculated for each customer in the subset and summed); (4) the number of expected impressions (i.e., the number of times consumers are presented with the promotion); (5) the average number of impressions per switch; (6) the brand switches because of the promotion; (7) the brand extensions because of the promotion; (8) the new trials of the brand; (9) the non-promotion volume (i.e., the number of units of the brand sold which are not tied to the promotion); (10) the promotion volume (i.e., the number of units of the brand sold which are tied to the promotion); (11) the promotion cost (i.e., the total cost of the discounts for the promotion volume); (12) the discount per impression; (13) the cost per switch (i.e., the promotion cost divided by the number of brand switches); (14) the revenues from the promotion; and (15) the incremental revenue for a predetermined number of replenishment cycles due to the promotion. Further, the incremental lift for a predetermined number of replenishment cycles due to the promotion may be determined.

The average impressions required per brand switch may be calculated in a variety of ways. One way is based on data from acceptance and rejection of previous offers. If the previous offer is for a similar product and/or a similar promotion, the average impressions per switch may be used. Or, the data for a similar product and/or a similar promotion may be used to extrapolate an equation that is a function of the amount of discount offered and the number of times the promotion is offered. Another way is to assume a functional form, such as linear or quadratic, for the average impressions required per brand switch. For example, the average impressions required per brand switch may be a linear equation and be a function of brand loyalty (with lower brand loyalty requiring fewer average impressions per switch and higher brand loyalty requiring more average impressions per switch). FIG. 13 also shows a histogram of an estimate of the number of switches based on the different loyalty bins. Though FIG. 13 depicts an estimate, the fee charged for the promotion may be based on whether a person accepts the promotion (e.g., purchases the product) and on the loyalty attribute of the person.

Given the output of the promotion simulator 1230, the parameters of the promotion may be modified and the simulation may be executed with the modified parameters. The selection of the modified parameters may be performed manually, or may be performed by the modified parameter selection module 1240. For example, the duration of the promotion may be modified and the simulation may be re-executed. The output of the re-executed simulation may thereafter be analyzed manually or automatically. Thus, the selection of parameters and promotion simulation may iterate multiple times in order to select improved or optimal parameters for a promotion.
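The iteration between parameter selection and simulation might, as one possible sketch, be automated as a simple search over candidate durations. The description does not specify the search strategy or the objective, so the example assumes an exhaustive search that minimizes the cost per switch. The function `toy_simulator` is a hypothetical stand-in for the promotion simulator with invented numbers, and `select_duration` is likewise a hypothetical name.

```python
from typing import Callable, Iterable, Tuple


def select_duration(simulate: Callable[[int], dict],
                    candidate_durations: Iterable[int]) -> Tuple[int, dict]:
    """Re-run the simulation for each candidate duration and keep the best.

    Assumption: "best" means the lowest cost per switch; any other output
    of the simulation could serve as the objective instead.
    """
    best_duration, best_output = None, None
    for duration in candidate_durations:
        output = simulate(duration)
        if (best_output is None
                or output["cost_per_switch"] < best_output["cost_per_switch"]):
            best_duration, best_output = duration, output
    if best_duration is None:
        raise ValueError("no candidate durations supplied")
    return best_duration, best_output


def toy_simulator(duration_days: int) -> dict:
    """Hypothetical stand-in for the promotion simulator (invented numbers)."""
    switches = min(duration_days * 3, 40)   # switches saturate at 40
    cost = 50.0 + 10.0 * duration_days      # fixed setup cost plus daily cost
    return {
        "brand_switches": switches,
        "promotion_cost": cost,
        "cost_per_switch": cost / switches,
    }
```

Calling `select_duration(toy_simulator, [7, 14, 21])` evaluates each duration in turn. With the invented numbers above, the 14-day promotion has the lowest cost per switch.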

Further, the output of the promotion simulator may be compared with actual results from past promotions. FIG. 14 shows a projection screen illustrating a mechanism for viewing results of past promotions. As shown in FIG. 14, the actual results from promotions for periods may be shown, such as the current period (designated under Product Info as “Current Period”), the period prior to the current period (designated as “Change Since Last”), and two periods prior to the current period (designated as “Change Since 2 Past”). Further, a graph may be generated which graphs the actual results based on any attributes of the customers. For example, a graph of the pantry loading attribute and brand loyalty attribute is shown for the current period. The actual results of past promotions may be examined alongside the results from the promotion simulator, thereby comparing the actual versus predicted. Any one or all of the goals or parameters of the promotion may be modified based on the comparison.

Finally, planning promotions based on individual models using the simulation and optimization techniques discussed allows retailers to evaluate the costs of such promotions with much greater efficiency. In addition, it allows manufacturers to pay for palpable business results in terms of new customers, enhanced loyalty, and incremental revenue and lift. Any retailer utilizing the system then has a distinct advantage in bidding for promotional dollars from a manufacturer interested in paying for direct performance.

III. Inventory Control

As discussed above, another aspect of a retail establishment's business may include inventory control. The retail establishment may wish to improve its management of inventory to reduce the amount of inventory while still maintaining sufficient inventory for the retail establishment's customers. Reducing the amount of items in inventory may reduce costs.

Using the promotion simulator and the customer models described above, the retail establishment may predict the amount of a product category that will be purchased in an upcoming predetermined period, such as the number of items of a specific brand or a brand family. For example, a retail establishment may wish to estimate the approximate number of ½ gallons of Minute Maid® orange juice purchased in the next two weeks. This estimate may be calculated whether or not a promotion is run.

If a promotion is run during the predetermined period, the parameters of the promotion, used for input to the promotion planning module, may be known. For example, the duration of the promotion, the discount for the promotion, the customers targeted for the promotion, the customers not targeted for the promotion, etc., may be known. Further, in simulating the estimate, all of the potential customers of the retail establishment may be accounted for in determining the estimate. To account for all of the potential customers, customer models may be used for each potential customer of the retail establishment. Typically, at least a portion, but not all, of a retail establishment's customers are in the retail establishment's loyalty program. If a customer is in the loyalty program, there may be sufficient data to generate a customer model for the particular customer. For those customers who are not in the loyalty card program and therefore do not have a customer model, an average customer model may be assigned to these customers. For example, if there are 1000 potential customers for a retail establishment, a subset of the 1000 potential customers (such as 800 customers) may have individual customer profiles. Customers who do not have an individual profile are assigned an “aggregate” customer model. The aggregate customer model may be derived from data which is not used in other customer models. In the present example of 1000 customers, the data for the remaining 200 customers may be used to statistically derive the aggregate customer model. Therefore, the promotion simulation may be run with a customer profile for each customer to determine the amount of the product category purchased due to the promotion and the amount of the product category purchased not due to the promotion. Typically, a customer may not have an individual customer model where the customer does not have sufficient data to extrapolate the individual model (e.g., the customer just started the loyalty program) or the customer does not wish to identify himself or herself to the retail establishment.
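As a minimal sketch of the assignment described above, each potential customer may be mapped either to an individual customer model or to the shared aggregate customer model. For illustration, a customer model is reduced here to a single assumed attribute, `weekly_units` (expected units of the product category purchased per week). The function names, the use of placeholder identifiers for customers who are not individually identified, and the averaging used to derive the aggregate model are all assumptions of the example.

```python
def build_aggregate_model(unattributed_units: float,
                          n_unmodelled_customers: int,
                          weeks_observed: float) -> dict:
    """Derive an aggregate model from data not used in any individual model.

    Assumption: the aggregate model is a per-customer weekly average of the
    units sold in transactions not attributed to a modelled customer.
    """
    if n_unmodelled_customers <= 0 or weeks_observed <= 0:
        raise ValueError("customer count and observation period must be positive")
    rate = unattributed_units / (n_unmodelled_customers * weeks_observed)
    return {"weekly_units": rate}


def assign_models(customer_ids, individual_models: dict,
                  aggregate_model: dict) -> dict:
    """Give every potential customer a model, falling back to the aggregate."""
    return {cid: individual_models.get(cid, aggregate_model)
            for cid in customer_ids}
```

For instance, if 400 units were sold over 8 weeks in transactions attributable to 200 customers without individual models, the aggregate model under these assumptions ascribes 0.25 units per week to each such customer.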

Based on those customers who have individual customer models and those customers who are ascribed the average customer model, a subset of the customers who receive the promotion and a subset of customers who do not receive the promotion may be determined. After the subsets are determined, a simulation may be run with parameters describing the next two weeks of any discounts or advertising for ½ gallons of Minute Maid® orange juice. The output of the simulation may include a number of units of ½ gallons of Minute Maid® orange juice sold due to the promotion and a number of units of ½ gallons of Minute Maid® orange juice sold not due to the promotion. Based on this estimate, the inventory may be controlled so that a sufficient amount of ½ gallons of Minute Maid® orange juice is in stock in the upcoming period.

If no promotion is run during the predetermined period, the promotion planning module discussed above may still be used. The parameters used for input for the promotion planning module include the duration, which may be the predetermined period, and the amount of promotion, which is zero. Further, purchases for all customers, regardless of brand loyalty to the product category, are sought. Therefore, the entire range of brand loyalty to the product category is input to the promotion simulator so that all customer models are accounted for in the simulation. Moreover, each potential customer of the retail establishment may be accounted for using either individual customer models or average customer models, as discussed above. Since all purchases for the product category are sought, all of the customer models available may be used for the simulation.

Alternatively, the sub-model for the shopping list predictor may be used. As discussed above, the shopping list predictor may use various statistical analyses to determine a probability that a particular customer may purchase a product category. For example, based on analysis of previous customer transactions, the shopping list predictor may determine a 0.8 probability that the customer will purchase one ½ gallon of Minute Maid® orange juice. Given the customer's frequency of shopping attribute and given the shopping list sub-model, a prediction may be made for a specific customer whether (and how much) the customer will purchase of the product category in the predetermined period. These calculations may be performed for each customer with a customer model, and the number of the product category summed for all of the customers with customer models. Further, for the customers of the retail establishment who do not have customer models, an average customer model may be used. Specifically, an average shopping list predictor sub-model may be derived for the product category for an average customer using transaction data for all customers who do not have a customer model (i.e., data for previous transactions for the product category for all customers who do not have a customer model). Using customer models for all of the customers of a retail establishment (i.e., using individual customer models for customers who have them and using average customer models for customers who do not have individual models), the number of items of a product category purchased in a predetermined period may be estimated.
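One simple way the above summation might be sketched is shown below. The example assumes that the 0.8 probability applies independently to each shopping trip, that the frequency of shopping attribute has already been converted to an expected number of trips in the predetermined period, and that a fixed quantity is bought per purchase. These assumptions, and the names `expected_units` and `category_demand`, are illustrative only.

```python
def expected_units(purchase_probability: float,
                   trips_in_period: float,
                   units_per_purchase: float = 1.0) -> float:
    """Expected units bought by one customer in the predetermined period.

    Assumption: the purchase probability applies independently per trip.
    """
    return purchase_probability * trips_in_period * units_per_purchase


def category_demand(modelled_customers, n_unmodelled: int,
                    average_customer) -> float:
    """Sum expected units over all customers of the retail establishment.

    modelled_customers: iterable of (probability, trips, units_per_purchase)
        tuples taken from individual customer models.
    average_customer: one such tuple from the average shopping list
        predictor sub-model, applied to each customer without a model.
    """
    total = sum(expected_units(*customer) for customer in modelled_customers)
    total += n_unmodelled * expected_units(*average_customer)
    return total
```

Under these assumptions, a customer with a 0.8 purchase probability and two expected trips in the period contributes 1.6 expected units to the estimate.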

Additionally, the simulator and the customer models described above may be used to determine the effect of removing an item or adding an item to a retail establishment. If an item is currently being sold by a retail establishment, the simulator and the customer models may predict the potential revenue lost or gained by removing the item. For example, the simulator may suggest whether customers will purchase a lower or higher margin product, will stop purchasing other items at the retail establishment, or will stop purchasing items altogether. Conversely, if an item is not currently being sold by a retail establishment, the simulator and the customer models may predict the potential revenue lost or gained by adding the item for sale by the retail establishment.

While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. Accordingly, the invention is not to be restricted except in light of the attached claims and their equivalents.

1. (canceled)
2. A computer-implemented method of evaluating performance of a shopping list predictor, the method comprising: accessing a customer model that stores transaction data associated with a customer, wherein the transaction data describes previous purchases that were made by the customer; determining, by one or more computers, at least one performance metric to be used for evaluating the shopping list predictor; determining, from the transaction data, training data to be used for training the shopping list predictor and test data to be used for evaluating the performance of the shopping list predictor; generating, by the one or more computers and based on the training data, a predicted shopping list using the shopping list predictor; comparing, by the one or more computers, the predicted shopping list with the test data to determine a match between product categories that were predicted by the shopping list predictor with product categories that were actually purchased by the customer; determining a value of a performance metric based on comparing the predicted shopping list with the test data; and outputting, by the one or more computers, a measure of performance for the shopping list predictor based on the value of the performance metric.
3. The computer-implemented method of claim 2, wherein the at least one performance metric comprises at least one of a standard recall, a precision, an accuracy, or an f-measure.
4. The computer-implemented method of claim 2, wherein generating a predicted shopping list comprises predicting a probability that the customer will purchase a product from a particular product category.
5. The computer-implemented method of claim 2, wherein generating a predicted shopping list comprises: determining, based on the training data, at least one attribute to be used in generating a predicted shopping list; and generating, based on the determined at least one attribute, a prediction of a product category containing a product that the customer will likely acquire on a shopping trip.
6. The computer-implemented method of claim 5, wherein determining the at least one attribute comprises applying one or more rules to the training data to determine the at least one attribute.
7. The computer-implemented method of claim 5, wherein determining the at least one attribute comprises applying machine learning to the training data to determine the at least one attribute.
8. A computer-implemented method comprising: accessing customer models that store transaction data associated with one or more customers; determining, by one or more computers and based on the transaction data associated with the one or more customers, a predicted probability that customers will purchase products belonging to a product category during a predetermined period of time; generating, by the one or more computers and based on the predicted probability, a prediction of an amount of products in the product category that will be purchased during the predetermined period of time; and determining an amount by which an inventory of products in the product category can be reduced, based at least on the prediction of an amount of products in the product category that will be purchased during the predetermined period of time.
9. The computer-implemented method of claim 8, further comprising: performing a promotion simulation based on the transaction data associated with the one or more customers; and determining, based on results of the promotion simulation, an effect of promotions on the amount of products in the product category that will be purchased in the predetermined period of time.
10. The computer-implemented method of claim 9, wherein performing a promotion simulation comprises: determining parameters of a promotion, the parameters including at least one of a discount for the promotion, customers targeted for the promotion, or customers not targeted for the promotion.
11. The computer-implemented method of claim 10, wherein performing a promotion simulation further comprises: identifying one or more potential customers for whom a customer model is not available; associating with the one or more potential customers an aggregate customer model that is statistically derived from existing customer models; and performing the promotion simulation based on transaction data that includes transaction data obtained from aggregate customer models associated with the one or more potential customers.
12. The computer-implemented method of claim 8, wherein generating a prediction of an amount of products in a product category that will be purchased comprises generating a prediction of a number of items of a specific brand that will be purchased during the predetermined period of time.
13. The computer-implemented method of claim 8, further comprising determining a predicted change in revenue by removing or adding products from the product category.
14. A computer-implemented method of promotion planning, the method comprising: determining, by one or more computers, a customer model comprising a plurality of promotion attributes comprising at least one of (i) a price sensitivity attribute indicative of sensitivity of a customer to a product price, or (ii) a brand loyalty attribute indicative of loyalty of the customer to a product brand, the promotion attributes being derived from customer data comprising transaction data associated with the customer; predicting, by the one or more computers, a product of interest to the customer; and generating, by the one or more computers, a promotional offer based on at least one of the promotion attributes and the product of interest to the customer.
15. The computer-implemented method of claim 14, further comprising: determining the price sensitivity attribute based on a number of times the customer bought the product of interest in a particular store when the product of interest was priced at a particular price.
16. The computer-implemented method of claim 14, wherein the price sensitivity attribute is further indicative of sensitivity of the customer to prices of a plurality of products belonging to a product category that includes the product of interest.
17. The computer-implemented method of claim 14, wherein generating, by the one or more computers, a promotional offer based on at least one of the promotion attributes and the product of interest to the customer comprises: accessing the price sensitivity attribute for the product of interest; and determining, based on the price sensitivity attribute, an amount of discount to be offered for the product of interest as part of the promotional offer.
18. The computer-implemented method of claim 14, further comprising: determining the brand loyalty attribute based on a brand loyalty score indicative of a propensity of the customer to buy a specific brand given the availability of that brand in a product category that includes the product of interest.
19. The computer-implemented method of claim 18, further comprising: modifying the brand loyalty score based on a popularity of the specific brand to other customers or based on a price of the specific brand relative to other brands.
20. The computer-implemented method of claim 19, wherein modifying the brand loyalty score based on a popularity of the specific brand to other customers comprises: reducing the brand loyalty score if the specific brand is determined to be popular among other customers, and increasing the brand loyalty score if the specific brand is determined to be unpopular among other customers.
21. The computer-implemented method of claim 19, wherein modifying the brand loyalty score based on a price of the specific brand relative to other brands comprises: reducing the brand loyalty score if the specific brand is determined to be less expensive than the other brands, and increasing the brand loyalty score if the specific brand is determined to be more expensive than the other brands.